1.0 INTRODUCTION


This document presents the results of a twelve-month, 6.1 research
effort sponsored by RADC/IRRE and performed at VERAC, Incorporated in
San Diego. The primary thrust of the study involved the development of
image coding techniques based upon the singular value decomposition
(SVD) operation and intended for application to bandwidth compression
of tactical imagery. An important aspect of the study was a thorough
comparison of the new SVD approaches to other transform image coding
schemes.

Compression  algorithms  based  upon  four  distinct  image 
transformations  were  examined: 

•  Singular  Value  Decomposition, 

•  Karhunen-Loeve, 

•  Cosine,  and 

• Hadamard.

The singular value decomposition coding algorithms were new, the
Karhunen-Loeve  coding  algorithms  were  extensions  of  previous  work,  and 
the  cosine  and  Hadamard  coding  algorithms  were  baselines  representative 
of  the  current  state  of  the  art  in  transform  image  coding. 

All  algorithms  were  designed  to  be  as  similar  as  possible,  both  in 
philosophy  and  implementation.  Differences  were  restricted  entirely  to 
the  particular  image  transformation  employed  in  each  case.  The  result 
was  a  common  framework  in  which  the  various  transformations  were 
evaluated  for  coding  efficiency  and  image  quality,  without  contamination 
by  performance  differences  that  can  arise  due  to  variations  in  other 
aspects  of  coder  implementation.  This  was  the  first  time,  to  our 
knowledge, that such a well-controlled environment was established for
comparison of alternative transform image coders.

The study consisted of three efforts:


•  Algorithm  Design, 

• Software Development, and

•  Coder  Evaluation. 

This  document  is  primarily  concerned  with  describing  the  various 
algorithms  developed  under  the  study  and  summarizing  their  comparative 
performances both among themselves and with respect to the baseline
algorithms. The software developed under the contract to implement the
coding algorithms is described in a companion report, "Image Compression
Software Documentation," VERAC Technical Report No. R-022-81.

1.1 Summary of Results

The  singular  value  decomposition  is  the  mathematical  transformation 
which  achieves  maximum  energy  compaction  into  the  fewest  number  of 
transform  coefficients,  called  singular  values  in  the  case  of  the  SVD. 
Thus, the SVD represents a potentially very useful operation for
reducing  the  bandwidth  required  to  encode  image  data,  since  a  small 
number  of  singular  values  can  be  encoded  in  place  of  a  larger  number  of 
pixels.  The  SVD  achieves  this  efficient  compaction  by  tailoring  the 
transform  operator  —  called  singular  vectors  for  the  SVD  —  to  the 
image  data  itself.  The  price  for  this  tailoring  is  that  the  singular 
vectors  must  also  be  encoded  along  with  the  singular  values  to  permit 
the  decoder  to  perform  image  reconstruction. 
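
As a point of reference, the standard SVD identities behind this trade-off can be stated compactly. The symbols below (an n x n image block X of rank r, and a retained count k) are notation assumed here for illustration and are not taken from the report.

```latex
X = U \Sigma V^{T} = \sum_{i=1}^{r} \sigma_i \, u_i v_i^{T},
\qquad
\hat{X}_k = \sum_{i=1}^{k} \sigma_i \, u_i v_i^{T}, \quad k \le r,
\qquad
\left\| X - \hat{X}_k \right\|_F^{2} = \sum_{i=k+1}^{r} \sigma_i^{2}
```

Under this notation, keeping k terms means encoding k singular values together with the 2nk components of the corresponding singular vectors, that is, up to k(2n + 1) numbers before quantization, in place of n^2 pixels. This is the overhead that the singular-vector coding schemes discussed next attempt to reduce.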

A  number  of  SVD-based  image  coding  algorithms  were  developed.  The 
variations  were  due  to  different  approaches  to  efficiently  coding 
singular  vectors.  The  result  was  an  assortment  of  SVD  coding  algorithms 
of  varying  complexity  which  were  identified,  implemented  and  evaluated. 

In addition to the SVD algorithms, a class-adaptive Karhunen-Loeve
transform (KLT) algorithm was also developed as a generalization of the
SVD approach. The basic idea is to replace the SVD's tailoring of
transform operators to image data by the KLT's tailoring to average
image characteristics. The result is a reduction in the number of
different transform operators that must be encoded: instead of one for
each block of imagery, one for each class of imagery is now required.
The price for this improvement is a concomitant lessening of the high
energy compaction produced by the SVD. Two versions of the KLT coder were
developed, one depending upon explicit training on image data, and the other
computationally simpler but based upon an assumed image model.

The various SVD and KLT algorithms were evaluated against each
other as well as against the baseline algorithms, which employed the
fixed (not tailored) cosine and Hadamard transforms. Evaluations were
performed over a range of coding rates, extending as low as 0.25 bits
per pixel (bpp) and as high as 1.5 bpp. The best in each category were
identified based upon a preliminary evaluation using a small set of test
imagery.  Next,  four  algorithms  —  one  SVD,  one  KLT  and  the  cosine  and 
Hadamard  —  were  comprehensively  evaluated  against  a  larger  set  of  test 
imagery.  This  imagery  included  visible  and  IR  aerial  photographs  and 
SAR  imagery,  all  quantized  to  8  bpp. 

All  four  algorithms  performed  well  on  the  test  images  at  1.5  bpp. 
The KLT and cosine algorithms had the highest coding efficiency, whereas the
Hadamard  algorithm  was  most  computationally  efficient.  Overall 
performance  —  jointly  considering  both  coding  and  computational 
efficiency  —  was  best  for  the  cosine  algorithm,  which  appeared  to 
perform  well  all  the  way  down  to  0.5  and  sometimes  0.25  bpp.  Despite 
the  intensive  effort  in  developing  the  most  efficient  SVD  coding 
algorithm  possible,  this  approach  was  found  to  be  inferior  to  the  cosine 
transform  coder. 

1.2  Roadmap 

The remainder of this report presents the algorithms developed
under this study and the results of evaluations performed to compare
these algorithms among themselves and against baseline algorithms.
Section 2 begins by defining study objectives and scope. Section 3
presents an overview of the transform image coding approach employed by
all the algorithms developed and tested. Section 4 then concentrates on
the details of the KLT algorithm and Section 5 upon the SVD algorithms.
Section 6 discusses the mechanism implemented to achieve rate
equalization in all algorithms. Section 7 next presents evaluation
results. Section 8 summarizes the study conclusions, and Section 9 lists
references. A variety of technical details which support various
aspects of coder algorithm development are presented in Appendices A
through F.

2.0 PROJECT SCOPE AND OBJECTIVES


The intelligence community and the Air Force have, for several
years,  realized  the  important  role  that  image  compression  will  be 
required  to  play  in  various  image  exploitation  and  intelligence  systems 
of  the  future.  Full  use  of  the  potential  of  these  systems  implies  a 
need  to  transmit  and  store  enormous  quantities  of  digital  image  data. 

As suggested in Figure 2-1, image compression (and associated
decompression or reconstruction) will directly impact the utilization of
these systems by bringing storage and transmission requirements within
technologically  feasible  bounds  of  transmission  and  storage  media. 

The  primary  focus  of  this  study  was  on  the  compression  of  single 
frame  tactical  imagery.  Such  imagery  arises  from  a  variety  of  imaging 
sensors,  including  those  sensitive  to  visible,  infrared,  and  microwave 
(radar) wavelengths. Applications typically include intelligence,
reconnaissance, and strike assessment.

We  differentiate  the  imagery  for  such  applications  from  the  TV-scan 
imagery  normally  associated  with  airborne  scanners,  trackers  or  target 
detectors/recognizers  and  used  in  weapon  fire  control.  In  our  case,  the 
imagery  tends  to  be  high  resolution,  with  large  area  coverage,  but  with 
relatively  long  revisit  times.  This  is  in  contrast  to  the  TV-scan 
imagery  which  is  typically  of  lower  resolution  and  smaller  field  of 
view,  but  with  revisits  at  video  rates.  The  effect  is  that  in  this 
study we only exploited spatial information: temporal redundancy was not
available  for  use  in  compression. 

In the course of the study, we concentrated upon the image
compression  algorithms  themselves,  and  not  upon  the  particular 
implementation  to  specific  transmission  or  storage  applications.  In 
particular,  we  did  not  take  specific  channel  characteristics  into 
consideration,  but  instead  focused  on  the  inherent  performance 
properties  of  the  various  algorithms.  We  did,  however,  design  and 
investigate  algorithms  for  use  over  the  range  of  compression  ratios 
anticipated  as  characteristic  of  various  transmission  and  storage 
channels of potential interest.

Our primary evaluation tool involved rate distortion measures which
describe coding efficiency in terms of the compression rates and image
degradations that result from application of the various coders. A
secondary measure was the computational efficiency associated with each
approach.

Image compression can be viewed as a coding process in which a
compact representation of the image is extracted which is sufficient for
subsequent viewing or analysis. The efficiency of the compression is
defined by the amount of information (measured in number of bits)
necessary for this purpose. One measure of this efficiency is the image
compression ratio, defined as the ratio of the number of bits
representing the original image to the number of bits in the coded
representation. An alternative is the compressed rate, defined as the
ratio of the number of bits in the coded representation to the number of
pixels in the original image. These quantities are related as follows:

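$$\text{compression ratio} = \frac{B_{\text{orig}}}{B_{\text{coded}}}$$

$$\text{compressed rate} = \frac{B_{\text{coded}}}{N \cdot M}$$

$$\text{original rate} = \frac{B_{\text{orig}}}{N \cdot M}$$

$$\text{compression ratio} = \frac{\text{original rate}}{\text{compressed rate}}$$

where

$B_{\text{orig}}$ = number of bits in original image

$B_{\text{coded}}$ = number of bits in coded representation

$N \cdot M$ = number of pixels in original image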
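
The relationships above can be checked with a few lines of arithmetic. The following is a minimal sketch; the 512 x 512 image size and the function names are hypothetical choices made for illustration, while the 8 bpp original and 1 bpp coded rates match the nominal operating point quoted later in this section.

```python
def compression_ratio(bits_original: int, bits_coded: int) -> float:
    """Ratio of bits in the original image to bits in the coded representation."""
    return bits_original / bits_coded


def rate_bpp(bits: int, n_rows: int, n_cols: int) -> float:
    """Rate in bits per pixel for an image of n_rows x n_cols pixels."""
    return bits / (n_rows * n_cols)


# Hypothetical example: a 512 x 512 image, 8 bpp original, coded at 1 bpp.
n_rows, n_cols = 512, 512
bits_original = 8 * n_rows * n_cols
bits_coded = 1 * n_rows * n_cols

ratio = compression_ratio(bits_original, bits_coded)
original_rate = rate_bpp(bits_original, n_rows, n_cols)
compressed_rate = rate_bpp(bits_coded, n_rows, n_cols)

# The compression ratio equals the original rate over the compressed rate.
assert ratio == original_rate / compressed_rate
print(f"{ratio:.0f}:1 at {compressed_rate} bpp")
```

Because the pixel count cancels, the compression ratio depends only on the two rates, which is why the report can quote either measure interchangeably once the original quantization is fixed.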
2.1 Algorithms


There are a variety of compression algorithms that can be applied
to image data. In this study, attention was restricted to a class of
particularly efficient techniques which involved use of two-dimensional
linear transformations of image data prior to encoding. Prominent in
this class are the well-known 2D cosine and Hadamard transforms, which
were included as baseline algorithms [1]. Image compression based upon
these transforms is marked by both computational efficiency (due to the
existence of specialized "fast" algorithms) and coding efficiency (due
to good "energy compaction" properties). While the associated
computational efficiency is a considerable advantage, the fact that
these transforms are not specific to an image, or at least to a class of
images, does suggest that these transforms produce less than optimal
coding efficiency.

This  study  was  concerned  with  developing  and  evaluating  image 
transform  coders  employing  transformations  more  tailored  to  image 
characteristics.  Primary  focus  was  on  the  singular  value  decomposition 
(SVD) operation, due to its known property of producing optimal energy
compaction.  Issues  concerning  both  coding  and  computational  efficiency 
were  addressed  and  are  reported  in  this  document. 

The  price  for  the  efficient  energy  compaction  of  the  SVD  is  that 
not  only  the  transform  coefficients  (singular  values)  themselves  but 
also the transform operators (singular vectors) must be transmitted or
stored  in  order  to  permit  decoding.  In  order  to  reduce  this  load, 
averages  over  a  number  of  similar  images  can  be  taken  so  that  the 
operators  are  no  longer  image-specific,  but  rather  class-specific.  The 
result  is  the  class-adaptive  Karhunen-Loeve  transform,  which  was  also 
included  in  this  study. 

2.2  Imagery  of  Interest 

The  tactical  image  compression  applications  to  which  transform 
coders  are  targeted  possess  rather  stringent  compression  requirements. 
Overall compression ratios on the order of 30:1 or 60:1 are often
required. In order to achieve such ratios and still maintain useful
image quality in image regions of tactical significance, a selective
compression  algorithm  is  required.  Such  an  algorithm  employs  priority 
designations  of  various  image  regions  as,  for  example,  "high  interest", 
"low  interest"  or  "background".  Each  such  designation  carries  with  it 
the requirement for a different level of compression. For example,
"high interest" might require a compression ratio on the order of only
8:1, whereas "background" might have to be compressed down to 60:1 or so.

The  idea  here  is  that  less  important  regions  are  assigned  a  greater 
share of the compression burden than are more important regions. The
overall  achievement  of  large  compression  ratios  depends  upon  the 
predominance  of  less  important  ("background")  regions  within  imagery. 
Fortunately,  tactical  imagery  often  has  this  characteristic  [2]. 
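
To see how the predominance of background drives the overall ratio, consider the following minimal sketch. The 10%/90% area split is a hypothetical figure chosen only for illustration; the 8:1 and 60:1 ratios are the example values quoted above, and all regions are assumed to share the same original bits per pixel.

```python
def overall_compression_ratio(regions):
    """Overall ratio for a list of (area_fraction, compression_ratio) pairs.

    Assumes every region has the same original bits per pixel, so the coded
    size is the area-weighted sum of 1/ratio for each region.
    """
    total_area = sum(fraction for fraction, _ in regions)
    if abs(total_area - 1.0) > 1e-9:
        raise ValueError("area fractions must sum to 1")
    coded_fraction = sum(fraction / ratio for fraction, ratio in regions)
    return 1.0 / coded_fraction


# Hypothetical mix: 10% "high interest" at 8:1, 90% "background" at 60:1.
mix = [(0.10, 8.0), (0.90, 60.0)]
overall = overall_compression_ratio(mix)
print(f"overall ratio is about {overall:.1f}:1")
```

Note that the small high-interest area contributes almost as many coded bits as the much larger background in this example, which is why the achievable overall ratio is so sensitive to how much of the scene must be preserved at high fidelity.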

In  this  study,  we  have  concentrated  on  the  more  difficult  to 
compress  "high  interest"  regions  of  images.  This  is  because  it  is  on 
such  data  that  transform  approaches  generally  perform  best,  yielding  the 
highest  coding  efficiency.  Additionally,  and  perhaps  more  importantly, 
we  focused  on  "high  interest"  image  regions  because  it  is  the  faithful 
rendition  of  such  regions  at  the  decoder  that  is  the  fundamental  raison 
d'etre of tactical image collection, transmission and exploitation
systems. 

We have investigated the applicability of the various transform
approaches  to  three  types  of  imagery: 

•  Visible  wavelength  aerial  photographs, 

• Synthetic aperture radar imagery, and

• Infrared framing camera photographs.


All  imagery  was  originally  quantized  to  8  bits  per  pixel.  The  range  of 
compression  ratios  studied  extended  from  5:1  (1.5  bits  per  pixel)  to 
32:1  (0.25  bits  per  pixel).  The  nominal  ratio  used  for  comparison  was 
8:1  (1  bit  per  pixel). 

2.2.1  Post  Processing  for  Blockiness  Suppression 

A  fundamental  aspect  of  a  transform  coder  is  that  it  is  applied  to 
images in a block-by-block fashion. When such a coder is required to
operate at high compression ratios, artifacts can appear at interblock
boundaries. This blockiness occurs because the coder processes
different blocks separately and because adjacent blocks often contain
image data sufficiently different that when severe compression is
applied, and these characteristics bloom out over the entire block,
discontinuities are created at block edges.

This  blockiness  behavior  is  not  restricted  to  transform  coders,  and 
in  fact,  has  been  observed  in  the  operation  of  other  compression 
algorithms  as  well.  There  are  several  fixes  which  are  possible,  all 
amounting  to  various  restoration/enhancement  schemes.  For  example, 
selective  averaging  across  block  edges  can  substantially  reduce  the 
visual  impact  of  blockiness  as  well  as  the  mean  square  degradation  error 
[3].  Although  developed  for  spatial  domain  implementation,  such  an 
approach  also  has  an  equivalent  implementation  in  the  transform  domain, 
and  could  be  integrated  as  a  final  post-processing  step  with  any  of  the 
transform  coders  investigated  under  this  study. 

However,  we  have  avoided  such  post-processing  considerations,  and 
have  concentrated  instead  on  the  effect  of  coder  algorithm  operation 
alone.  This  permitted  a  cleaner  assessment  of  coder  performance,  and 
enhanced our ability to isolate subtle image degradations introduced by
various alterations in coder parameter values. Since such
post-processing can always be added later, overall performance of an
eventual  coder  implementation  based  on  these  algorithms  was  not 
prematurely  compromised.  Introducing  it  at  this  early  stage  of 
algorithm  development  and  evaluation,  however,  would  have  merely 
degraded  our  ability  to  assess  algorithm  performance. 
3.0 OVERVIEW OF TRANSFORM IMAGE CODING


There  are  a  number  of  extant  approaches  to  compressing  single  frame 
imagery. Each approach represents a particular compromise among a set
of  conflicting  goals,  including: 

•  Maximize  compression, 

•  Minimize  degradation, 

•  Maximize  adaptivity, 

•  Minimize  encoder  complexity,  and 

• Minimize decoder complexity.

3.1  Image  Coding  Approaches 

Table 3-1 lists six categories of image coding approaches along
with an example or two for each. The simplest is PCM (Pulse Code
Modulation), which is simply a requantization of pixel intensities.
Such an approach includes companding (COMpressing and exPANding), as
well as adaptive versions that amount to digital automatic gain
control. This approach is the least complicated to implement, and
generally produces the least compression at a given level of distortion.

Table 3-1. Image Coding Approaches for
Compression of Single Frame Imagery

• PCM

- Companding

• PREDICTIVE

- DPCM

- Delta Modulation

• TRANSFORM

- Cosine

- SVD

• INTERPOLATIVE/EXTRAPOLATIVE

- Subsampling (e.g., MAPS)

- Splines

• OTHERS

- Contour

- Bit Plane

• HYBRID

- Cosine/DPCM

The next three categories (predictive, transform, and
interpolative/extrapolative) attempt to exploit the spatial
redundancies present in imagery. Predictive coding utilizes the
observation that, in high resolution imagery, neighboring pixels tend to
have similar intensity values. This information is used to encode only
the differences between pixel values and estimates of these values
predicted from previously encoded pixels. Since these differences tend
to be smaller than the pixel values themselves, fewer bits are needed to
encode them. A variety of versions, including schemes that are fixed
and adaptive and that are based on 1D and 2D prediction, are possible.
This approach is computationally efficient and performs reasonably well
on over-sampled digital imagery. However, it exploits only part of the
spatial redundancy in the scene.
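
The predictive idea can be sketched in a few lines. What follows is a minimal, fixed, 1D previous-pixel DPCM loop with a uniform quantizer; the predictor, the step size, and the sample scan line are assumptions made for illustration and are not the design of any coder evaluated in this report.

```python
def dpcm_encode(row, step):
    """Encode a scan line as quantized differences from the previous reconstruction."""
    codes = []
    prediction = 0  # the decoder starts from the same value
    for pixel in row:
        error = pixel - prediction
        code = round(error / step)  # uniform quantization of the difference
        codes.append(code)
        prediction += code * step   # track what the decoder will reconstruct
    return codes


def dpcm_decode(codes, step):
    """Rebuild the scan line by accumulating the dequantized differences."""
    row = []
    prediction = 0
    for code in codes:
        prediction += code * step
        row.append(prediction)
    return row


# Hypothetical scan line with the slowly varying intensities the text describes.
row = [100, 102, 101, 105, 110, 108, 107, 107]
step = 4
codes = dpcm_encode(row, step)
decoded = dpcm_decode(codes, step)
max_error = max(abs(a - b) for a, b in zip(row, decoded))
print(codes, decoded, max_error)
```

After the first sample, the codes are small integers, which is what allows them to be represented with fewer bits than the pixels themselves. Predicting from the reconstructed value rather than the original keeps encoder and decoder in step, so the quantization error does not accumulate along the line.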

Transform approaches tend to perform best on high-resolution,
moderate dynamic range, critically sampled imagery. This approach is
based on dividing the image into blocks, performing a mathematical
transform operation on each block, and encoding the resulting
coefficients [1]. A number of transforms are available, including the
Fourier, cosine, sine, Hadamard, Haar, slant, Karhunen-Loeve, and
singular value decomposition. This set spans the spectrum of coding and
computational efficiency. The fundamental idea involved in transform
coding is to apply a transform which compresses the block information
into a small number of coefficients which are then encoded in place of
the larger number of pixel values themselves. This approach exploits 2D
redundancy in the image, but only within the boundaries of individual
blocks. Both fixed and adaptive versions are possible.

As an alternative to transform approaches, the interpolative/extrapolative
approach attempts to fit curves to the two-dimensional
surface defined by pixel intensity values. Then, only the parameters of
the curves are coded. The simplest version uses piecewise constant
curves, such as are generated by CDC's MAPS (Micro Adaptive Processing
System) coder [4]. More ambitious approaches employ higher order
splines  [5].  The  keys  to  the  success  of  these  types  of  scheme  are  their 
adaptivity  to  local  image  characteristics  and  their  operation  on  imagery 
containing  a  high  proportion  of  smooth  areas,  which  thus  permits 
parsimonious  (low  order  and  extensive  in  area)  curve  parameterizations. 

The  remaining  approaches  in  the  table  are  either  specializations  or 
combinations  of  the  foregoing.  For  example,  contour  or  bit  plane  coding 
is based on binary images, and the cosine/DPCM hybrid combines a 1D
transform with a 1D predictive coder.

In this study, transform approaches were examined exclusively. New
SVD and KLT algorithms were developed and compared with baseline cosine
and  Hadamard  transform  algorithms. 

3.2  Transform  Image  Coding 

Figure  3-1  illustrates  the  transform  image  coding  chain  used 
throughout  the  study.  The  first  step  involves  extracting  a  block  from 
the image, which accomplishes a reformatting of the image from
raster-scan into block ordering. Following this is the input intensity
remapping step, which performs a memoryless transformation of the image
to compensate for sensor and display system nonlinearities.
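
The block extraction step can be illustrated with a short sketch. The function name, the left-to-right and top-to-bottom block ordering, and the requirement that the image dimensions be multiples of the block size are all assumptions made here; the report does not specify these details.

```python
def extract_blocks(image, block_size):
    """Reorder a raster-scan image (a list of rows) into a list of square blocks."""
    n_rows, n_cols = len(image), len(image[0])
    if n_rows % block_size or n_cols % block_size:
        raise ValueError("image dimensions must be multiples of the block size")
    blocks = []
    for top in range(0, n_rows, block_size):
        for left in range(0, n_cols, block_size):
            block = [row[left:left + block_size]
                     for row in image[top:top + block_size]]
            blocks.append(block)
    return blocks


# Hypothetical 4 x 4 image whose pixel values equal their raster-scan index.
image = [[r * 4 + c for c in range(4)] for r in range(4)]
blocks = extract_blocks(image, 2)
for block in blocks:
    print(block)
```

Each block gathers pixels from several raster lines, so a practical encoder has to buffer at least one block-height of scan lines before the first transform can be computed.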

The next step is the application of the 2D transformation to the
image block, creating an array of transform coefficients to replace the
block of pixel values. This is where the different mathematical
transforms  are  inserted  into  the  chain. 

After  conversion  to  transform  coefficients,  actual  encoding 
ensues.  This  is  the  step  that  performs  the  quantization  and  codeword 
assignment that constitutes the encoding of image information. It is
the quantization part of this operation that is responsible for the
deviations of a coded image from its original, by irreversibly degrading
the image representation: the coarser the quantization, the greater the
degradation (but the greater the compression). The trick is to perform
this quantization efficiently, i.e., with the introduction of as little
degradation  as  possible. 
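
The trade-off between quantizer coarseness and degradation can be demonstrated with a uniform quantizer. This is a minimal sketch only: the coefficient values and step sizes are hypothetical, and the coders in this study may use different quantizer designs.

```python
def quantize(coefficient, step):
    """Map a coefficient to an integer index (the irreversible step)."""
    return round(coefficient / step)


def dequantize(index, step):
    """Reconstruct the decoder's estimate of the coefficient from its index."""
    return index * step


def mean_square_error(coefficients, step):
    """Mean square difference between coefficients and their reconstructions."""
    errors = [c - dequantize(quantize(c, step), step) for c in coefficients]
    return sum(e * e for e in errors) / len(errors)


# Hypothetical transform coefficients: one large value and several small ones.
coefficients = [812.4, -96.3, 41.7, -12.9, 5.2, -1.8]

fine = mean_square_error(coefficients, 2.0)
coarse = mean_square_error(coefficients, 32.0)
zeroed = sum(1 for c in coefficients if quantize(c, 32.0) == 0)
print(fine, coarse, zeroed)
```

With the coarse step, the smallest coefficients quantize to zero and need essentially no bits, at the cost of a larger mean square error. Allocating finer steps to the coefficients that matter most is one way to keep the degradation small at a given rate.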

Next, the resulting codewords are reordered into 1D form and
entered into the channel as a bit stream. Depending upon the
application, the channel can take the form of a storage disk or magnetic
tape, or a digital communication system for downlinking data from a
sensor,  for  relaying  to  an  exploitation  center,  or  for  dissemination  to 
users.

Whatever the application, it is the number of bits exiting the
coefficient coder and entering the channel that describes the coder
efficiency,  either  in  terms  of  compression  ratio,  or,  the  measure 
preferred  here,  compressed  rate  (measured  in  bits  per  pixel). 

Note that, invariably, the channel includes its own (channel)
coder/decoder or modulator/demodulator (modem) which adds redundancy for
error protection. Examples are the parity bits written onto tape or the
burst error codes used in noisy communication systems. In any case,
this redundancy is excluded from the coder efficiency measures employed
in this report, i.e., we are describing source coder performance only.
We are not concerned with channel coder performance, since the
particular channel coder required in any situation is
application-dependent.

The elements in the chain following the channel constitute the
decoding  operation  and  hence  reverse  the  operation  of  the  various  steps 
applied  before  the  channel.  Coefficient  decoding  extracts  the 
appropriate  bit  patterns  from  the  bit  stream,  interprets  them  as 
codewords,  and  reconstructs  the  transform  coefficients  from  the  coded 
information.  This  reconstruction  is  not  exact,  however,  due  to  the 
quantization error introduced during the coefficient encoding
operation.  For  this  reason,  the  reconstructed  coefficients  are  not 
identical  to  the  original  coefficients  computed  during  encoding.  They 
are,  however,  the  best  available  estimates  of  these  coefficients  based 
on  encoded  data. 

Next,  the  reconstructed  coefficients  are  passed  through  the  inverse 
transformation,  producing  reconstructed  pixel  values.  Finally,  an 
output  intensity  remapping  is  applied  to  match  the  gray  scale  output  to 
the  display  system  characteristics,  and  the  block  is  re-inserted  into 
the  image  in  the  appropriate  location. 

3.3 Block Transformations

There are two underlying reasons for applying a 2D transformation
to a block of image data:

• To exploit spatial redundancy, and

• To concentrate information in a small number of coefficients.

The  first  of  these  reasons  means  that  the  correlation  in  intensity 
values  of  closely  spaced  pixels  should  be  exploited.  The  objective  is 
to  generate  a  set  of  coefficients  as  uncorrelated  as  possible,  with  some 
of  these  describing  gross  image  structure,  some  medium-sized  features, 
and  some  fine  detail.  In  this  way,  the  degree  of  degradation  introduced 
by  quantization  into  any  level  can  be  accounted  for  separately.  For 
example,  since  many  images  have  many  blocks  with  very  little  important 
fine  detail,  those  coefficients  can  be  neglected  —  that  is,  not  encoded 
—  with  little  loss  of  information.  Additionally,  the  least  mean  square 
image  degradation  is  produced  in  those  cases  where  the  coefficients  are 
completely  uncorrelated.  This  also  motivates  obtaining  a  transform 
which  decorrelates  pixels  as  much  as  possible  prior  to  coefficient 
encoding. 

The  second  objective  concerns  concentrating  the  block's  energy  into 
a  small  number  of  coefficients.  In  other  words,  the  smaller  the  subset 
of  coefficients  that  have  appreciable  size,  the  smaller  the  number  of 
coefficients  which  must  be  coded  for  faithful  image  representation.  But 
not  only  is  the  number  of  large  coefficients  important,  so  also  is  the 
consistency  of  their  location  within  the  coefficient  array.  Thus, 
transforms  which  consistently  produce  very  small  coefficient  values  in 
certain  fixed  locations  permit  having  those  coefficients  consistently 
ignored  by  the  coefficient  coder. 

3.4  Block  Size 


Image transform coding operates on images a block at a time, so
that  the  question  of  appropriate  block  size  immediately  arises.  There 
are  several  issues  involved  in  selecting  block  size,  since  blocks  with 
the  following  properties  are  required: 

• Small enough for computational efficiency,

• Large enough for substantial decorrelation,

• Small enough for local adaptivity, and

• Power of two for "fast" algorithms.

The  first  of  these  objectives  stipulates  a  reasonable  block  size  for 
implementation.  Both  calculation  time  and  storage  space  requirements 
grow  with  block  dimensions.  Consequently,  it  is  necessary  to  keep  these 
demands  to  a  reasonable  level.  Based  on  experience  with  the  2D  cosine 
transform,  a  maximum  block  size  of  32  X  32  is  indicated  [6]. 

The  objective  of  decorrelating  pixels  implies  that  blocks  should  be 
as  large  as  possible,  since  a  transform  is  only  able  to  decorrelate 
pixels within a block. No decorrelation of pixels in distinct blocks is
obtained.  Based  again  on  the  2D  cosine  transform,  a  minimum  size  of 
8  X  8  is  indicated  for  achieving  appreciable  decorrelation.  (This 
finding  is  based  on  critically  sampled  imagery  with  a  spatial 
correlation  coefficient  of  approximately  p  =  0.9  [6].) 

The  third  objective,  for  local  adaptivity,  implies  that  the  block 
size  should  be  small  enough  so  that  radically  different  image  structure 
does not appear within the same block. The motivation for this
requirement is based on cases where a small subregion of fine structure
and, hence, high interest, is embedded in an otherwise flat surround.
If the busy subregion occupies too small a portion of the block, its
effect  on  the  transform  coefficients  is  small  with  respect  to  that  of 
the  flat  surround.  Hence,  the  important  coefficients  are  small  and  may 
therefore  fail  to  be  encoded  accurately,  if  at  all.  Based  again  on  the 
cosine  transform  and  critically  sampled  imagery,  objectives  2  and  3  — 
for  high  decorrelation  and  local  adaptivity  —  balance  each  other  out  at 
a  size  of  approximately  16  x  16  [6]. 

Since  16  is  in  fact  a  power  of  two,  the  block  size  used  throughout 
the  study  for  all  algorithms  developed  and  compared  was  16  x  16. 

However, several notes are in order:

• Optimal block size may, in fact, vary with the particular
transform employed. Sixteen by sixteen is indicated for the
cosine transform, but was also adopted for the other
transforms in order to provide a consistent basis for
comparative performance evaluation.

• Optimal block size definitely depends upon the spatial
sampling frequency. Sixteen by sixteen was predicated upon
application to critically sampled raster imagery (i.e., at or
near the Nyquist rate in each dimension). Significantly
oversampled imagery would probably require larger block sizes
to achieve the same degree of pixel decorrelation.

• There is no law requiring square blocks. In fact, past
studies have indicated a degree of relative insensitivity to
block aspect ratio, as long as the total number of pixels
remains constant. Non-square blocks can arise naturally in
imagery obtained from sensors utilizing non-square pixels
(e.g., the common mod FLIR). We employed square blocks as a
default, in the absence of reasons to adopt non-square blocks
for the imagery of interest.

3.5  Unitary  Transforms 

Suppose  we  denote  a  block  of  image  data  by  the  symbol  X.  Based 
upon  our  adopted  block  size  of  16  x  16,  X  represents  a  16  x  16  matrix  of 
pixel  intensities.  A  linear  transformation  of  X  can  be  represented  by: 

$$ Z = T(X) $$

where  Z  represents  the  transform  coefficients  collected  into  a  second  16 
x  16  matrix.  The  transformation  T  is  linear,  implying  that 

$$ T(X_1 + X_2) = T(X_1) + T(X_2), \quad \text{and} \quad T(aX) = a\,T(X). $$

Linear transformations T are by far the most practical for image coding
applications,  due  to  their  easy  implementation  with  respect  to  general 
nonlinear  transformations. 


However,  even  restricting  T  to  be  linear  does  not  guarantee  a 
useful  or  easily  implementable  transformation.  Further  restricting  T  to 
be  in  the  class  of  separable  unitary  transformations  does,  however.  A 
separable,  unitary  transform  has  the  following  form: 

$$ Z = U^{t} X V $$

in which the coefficient array Z is obtained by premultiplication of the
pixel array X by the matrix U^t, and postmultiplication by V.

Furthermore,  the  transformation  matrices  U  and  V  are  unitary: 

$$ U^{t} U = U U^{t} = I $$

$$ V^{t} V = V V^{t} = I $$

Figure 3-2 illustrates the structure of both the forward and inverse
separable unitary transform.

The  advantage  of  such  a  transformation  is  that  it  possesses  the 
following  characteristics: 

•  Column/row  separable, 

•  Easy  to  invert,  and 

•  Norm  preserving. 

[Figure 3-2. Unitary Block Transform. Forward transform: Z = U^t X V,
taking the image block X to the transform coefficients Z, with U and V
orthogonal matrices. Inverse transform: X = U Z V^t, because
U^{-1} = U^t and V^{-1} = V^t.]

Column/row separability obtains because the columns and rows of X are
transformed separately: the U^t multiplication effects a column
transformation while the V multiplication effects a row transformation.
The result is that Z is obtained by applying 2n^3 operations, where n
represents the size of X, i.e., X is n x n. This is a great savings
over the n^4 operations required for a non-separable linear
transformation.
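As a quick check of these counts for the 16 x 16 blocks adopted in this
study (counting one multiply-add as one operation, which is our
simplifying assumption):

$$ 2n^{3} = 2 \cdot 16^{3} = 8192, \qquad n^{4} = 16^{4} = 65536, \qquad \frac{n^{4}}{2n^{3}} = \frac{n}{2} = 8. $$

Under that assumption, the separable form needs roughly one eighth of
the work of a general linear transformation at this block size.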

Easy inversion obtains because both U and V are unitary and
therefore their inverses are equal to their transposes:

$$ (U^{t})^{-1} = U, \qquad (V^{t})^{-1} = V $$


Thus,  both  forward  and  inverse  transformation  entail  the  same  amount  of 
work  and  utilize  the  same  operators  U  and  V. 

Norm preservation again is a consequence of the unitary character
of U and V. What it implies is that energy calculations can be applied
in either the pixel or coefficient domain. Specifically, if Z = [z_ij]
and X = [x_ij], then:

$$ \sum_{i,j=1}^{n} z_{ij}^{2} = \sum_{i,j=1}^{n} x_{ij}^{2} $$

This property is extremely important in devising and analyzing
coefficient coding schemes. For example, if z_pq is small and is
neglected (i.e., not coded and then approximated by zero) the effect in
the pixel domain can be predicted as a decrease in signal energy by
z_pq^2.
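The three properties can be checked numerically with a minimal sketch.
It assumes a 4 x 4 block and uses a normalized Hadamard matrix for both
U and V purely as a convenient orthonormal example; the helper names
(forward, inverse, energy) and the sample block are illustrative and
not taken from the study's software.

```python
def transpose(a):
    return [list(col) for col in zip(*a)]

def matmul(a, b):
    bt = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in bt] for row in a]

def forward(x, u, v):
    """Separable forward transform Z = U^t X V."""
    return matmul(matmul(transpose(u), x), v)

def inverse(z, u, v):
    """Inverse transform X = U Z V^t (valid because U and V are orthonormal)."""
    return matmul(matmul(u, z), transpose(v))

def energy(a):
    """Sum of squared elements."""
    return sum(e * e for row in a for e in row)

# Assumed example operator: 4 x 4 Hadamard matrix scaled by 1/2 (orthonormal).
U = V = [[0.5 * e for e in row] for row in
         [[1, 1, 1, 1], [1, -1, 1, -1], [1, 1, -1, -1], [1, -1, -1, 1]]]

# Arbitrary illustrative pixel block.
X = [[float((3 * i + 5 * j) % 7) for j in range(4)] for i in range(4)]

Z = forward(X, U, V)
X_back = inverse(Z, U, V)          # same operators, same amount of work

# Neglect one coefficient z_pq and reconstruct.
p, q = 3, 2
dropped = Z[p][q] ** 2
Z_trunc = [row[:] for row in Z]
Z_trunc[p][q] = 0.0
X_trunc = inverse(Z_trunc, U, V)   # energy(X) - energy(X_trunc) equals dropped
```

The reconstruction X_back agrees with X to rounding error, energy(X)
equals energy(Z), and zeroing a single coefficient lowers the pixel
domain energy by exactly the square of that coefficient.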


The  effect  of  a  separable  unitary  transformation  can  best  be 
explained  by  considering  basis  blocks.  First  adopt  the  notation: 


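$$ Z = [z_{ij}], \qquad U = [u_1 \; u_2 \; \cdots \; u_n], \qquad V = [v_1 \; v_2 \; \cdots \; v_n] $$

which depicts the elements of the coefficient array Z and the columns of
the transformation matrices U and V. Then the inverse transform can be
expanded as follows:

$$ X = U Z V^{t} = [u_1 \; u_2 \; \cdots \; u_n] \begin{bmatrix} z_{11} & z_{12} & \cdots & z_{1n} \\ z_{21} & z_{22} & \cdots & z_{2n} \\ \vdots & \vdots & & \vdots \\ z_{n1} & z_{n2} & \cdots & z_{nn} \end{bmatrix} \begin{bmatrix} v_1^{t} \\ v_2^{t} \\ \vdots \\ v_n^{t} \end{bmatrix} = \sum_{i,j=1}^{n} z_{ij} \, u_i v_j^{t} $$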


Thus, the pixel block X is given as a weighted sum of rank one matrices
u_i v_j^t.

Each rank one matrix u_i v_j^t represents an elementary image
block called a basis block. Together they constitute the fundamental
components from which the overall X is constructed. In general, there
are n^2 such basis blocks, which are weighted according to the
corresponding coefficient values z_ij and combined to form X. The
coefficient z_ij thus represents the strength of basis block
u_i v_j^t contained in X. If the basis blocks are known to the
decoder, only the coefficient values z_ij need be encoded into the
channel. The decoder can then reconstruct the image block X via an
inverse transformation of Z, namely X = U Z V^t.
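The basis block view can be illustrated with a small sketch that
rebuilds X by explicitly summing weighted outer products. The 2 x 2
normalized Hadamard operator, the coefficient values, and the function
names are assumptions chosen to keep the arithmetic easy to follow.

```python
import math

def outer(u, v):
    """Rank one matrix u v^t."""
    return [[a * b for b in v] for a in u]

def column(m, j):
    return [row[j] for row in m]

def reconstruct_from_basis_blocks(z, u, v):
    """X = sum over i, j of z_ij * (u_i v_j^t), with u_i, v_j the columns of U, V."""
    n = len(z)
    x = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            block = outer(column(u, i), column(v, j))   # basis block (i, j)
            for r in range(n):
                for c in range(n):
                    x[r][c] += z[i][j] * block[r][c]
    return x

# Assumed example: 2 x 2 normalized Hadamard operators.
s = 1.0 / math.sqrt(2.0)
U = V = [[s, s], [s, -s]]

# Two non-zero coefficients: a flat (d.c.) term and a vertical-change term.
Z = [[6.0, 0.0], [2.0, 0.0]]
X = reconstruct_from_basis_blocks(Z, U, V)
```

Here the first coefficient contributes a flat block of value 3 and the
second adds +1 to the top row and -1 to the bottom row, so X comes out
as rows (4, 4) and (2, 2), the same result as the matrix product
U Z V^t.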

3.6  Applicable  Transformations 

There  are  a  number  of  separable  unitary  transformations  which  can 
be  applied  for  image  compression.  These  generally  can  be  classed  in  one 
of three categories:

• Fixed,

• Tailored to statistics, or

• Tailored to block itself.


Fixed transformations have received the greatest amount of
attention for application to image coding. These comprise 2D extensions
of familiar 1D unitary transformations and are characterized by fixed
operators U and V. They include:

• Fourier,

• Cosine,

• Hadamard,

• Haar, and

• Slant.


The first two of these employ sinusoidal basis functions (i.e.,
the  columns  of  U  and  V  are  sampled  sinusoids),  whereas  the  last  three 
employ  square  wave,  tertiary  or  triangular  wave  basis  functions.  A 
primary  advantage  of  using  the  fixed  type  of  transformation  is  its  ease 
of  implementation,  often  by  a  "fast"  algorithm.  The  primary 
disadvantage  is  that  these  transformations  are  not  sensitive  to  changes 
in  local  image  characteristics,  and  so  may  work  much  better  on  some 
image  blocks  than  on  others. 


The  goal  of  adapting  the  transformation  to  local  image 
characteristics  motivates  consideration  of  the  remaining  two  tailored 
types  of  transformation.  The  first  of  these,  which  adjusts  the 
operators  U  and  V  to  local  image  statistics,  is  best  represented  by  the 
Karhunen-Loeve  transform,  which  is  sensitive  to  second  order  block 
statistics.  The  second  type  of  adaptive  transform  varies  with  the  block 
data  itself,  and  is  best  represented  by  the  singular  value 
decomposition,  in  which  the  U  and  V  operators  depend  upon  the  image 
block X itself.

3.6.1 Optimal Decorrelating Transform

To  better  appreciate  the  interrelationships  among  these  three 
transform  types,  a  statistical  viewpoint  is  helpful.  In  this  viewpoint, 
an  image  can  often  be  reasonably  modeled  as  a  sample  from  a  spatially 
correlated,  discrete  random  field.  If  the  additional  assumptions  that 
the  image  is  (spatially)  stationary  and  Gaussian  are  included,  Shannon 
theory  indicates  that  optimal  compression  (least  distortion  for  a  given 
compression  rate)  can  be  achieved  by  first  applying  a  decorrelating 
transform  to  convert  the  correlated  pixels  to  a  set  of  uncorrelated 
random  variables,  followed  by  encoding  the  resulting  uncorrelated  random 
variables  with  a  memoryless  coder. 

The transform which is statistically optimal for decorrelating a
block from a stationary image is the Hotelling, or discrete
Karhunen-Loeve, transform. When the image has a separable covariance
function, this transform takes the form

$$ Z = U^{t} X V $$

where  U  and  V  are  determined  from  the  image  covariance  function.  For 
this  transform,  Z  is  an  array  of  completely  uncorrelated  random 
variables. 

For  two  primary  reasons,  technical  effort  has  historically  been 
directed  away  from  the  optimal  transform  and  focused  instead  on  other 
transforms  which  only  approximate  the  optimal  decorrelating  transform: 

• No fast algorithm generally exists for performing the
transform, and

• The procedure for deriving the Karhunen-Loeve transform
involves potentially erroneous assumptions about the image
model itself, resulting in difficulties with specific
applications.



3.6.2  Cosine  Transform 


Historically,  the  first  suboptimal  transform  to  be  considered  was 
the discrete Fourier transform, in which U and V take the familiar form
of  sampled  complex  sinusoids  [7].  A  prime  motivation  for  using  this 
transform  is  the  fact  that  as  the  block  size  grows  (again,  under  the 
stationarity  assumption),  the  Fourier  transform  approaches  the  optimal 
transform  in  the  mean-squared  sense.  Of  more  practical  concern  are  the 
facts  that  the  Fourier  transform  produces  a  (complex)  coefficient  array 
Z  which  is  highly  (though  not  perfectly)  uncorrelated,  and  that  a  fast 
implementation  (the  FFT)  exists. 

However,  a  problem  basic  to  use  of  this  transform  in  coding  is  the 
Gibbs  phenomenon,  which  results  in  severe  artifacts  near  the  edges  of 
the compressed array, and thus introduces objectionable blocking in
images that are block transformed. This latter problem can be
eliminated by introducing a forced symmetry into the block, resulting in
the cosine transform [8]. For this transform, U and V are sampled real
sinusoids,  and  the  coefficients  Z  are  themselves  all  real.  Because  of 
its  direct  relationship  to  the  Fourier  transform,  the  cosine  transform 
retains  the  Fourier  transform's  optimal  asymptotic  behavior,  and  is  in 
fact  superior  to  the  Fourier  transform  for  decorrelating  smaller  sized 
blocks.  In  addition,  the  FFT  can  still  be  used  in  actually  executing 
the  transformation. 

A  key  property  of  the  cosine  transform  which  makes  it  particularly 
attractive  for  image  compression  is  the  energy  compaction  into  the  lower 
frequency  coefficients  that  occurs  for  most  images.  Consequently,  by 
concentrating  on  transmitting  the  larger  magnitude,  generally  lower 
frequency  coefficients,  efficient  coding  with  only  slight  loss  of  image 
energy is possible [9].
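Energy compaction can be seen in a minimal sketch. It assumes the
orthonormal DCT-II form of the cosine transform (the report does not
give the exact normalization), an 8 x 8 block for brevity, and a smooth
ramp as the test image; the function names are illustrative.

```python
import math

def dct_matrix(n):
    """Operator U whose columns are sampled cosines (orthonormal DCT-II)."""
    u = [[0.0] * n for _ in range(n)]
    for i in range(n):              # sample (pixel) index
        for k in range(n):          # frequency index
            scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
            u[i][k] = scale * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
    return u

def transpose(a):
    return [list(col) for col in zip(*a)]

def matmul(a, b):
    bt = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in bt] for row in a]

def low_frequency_energy_fraction(z, m):
    """Fraction of total energy held by the m x m lowest-frequency corner of Z."""
    total = sum(e * e for row in z for e in row)
    low = sum(z[i][j] ** 2 for i in range(m) for j in range(m))
    return low / total

n = 8
U = dct_matrix(n)
X = [[float(i + j) for j in range(n)] for i in range(n)]   # smooth ramp block
Z = matmul(matmul(transpose(U), X), U)                     # Z = U^t X V with V = U
fraction = low_frequency_energy_fraction(Z, 2)
```

For this smooth block, the four lowest-frequency coefficients out of 64
carry more than 99 percent of the energy. Busier blocks spread their
energy further, which is why the block classification discussed later
matters.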

3.6.3  Hadamard  Transform 

The Hadamard transform is a binary approximation to the cosine
transform that is characterized by unitary matrices U and V all of whose
elements (apart from a common normalizing factor) are either 1 or -1.
An alternate interpretation is that the columns of U are sampled Walsh
functions, so that this transform is also known as the Walsh transform.

The  major  advantage  of  the  Hadamard  transform  is  its  ease  of 
implementation.  Not  only  are  multiplications  eliminated  in  forming  the 
coefficient array Z, but a fast algorithm akin to the FFT also exists to
speed execution. The price of this efficiency is a degradation in the
decorrelational  properties  of  the  transform  relative  to  the  cosine 
transform.  Even  so,  the  transform  does  a  fairly  good  job  of 
decorrelating  images  and  of  compacting  energy  into  the  lower  "sequency" 
coefficients  of  Z  [10]. 
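The following sketch shows one common way to realize such a transform.
It assumes the Sylvester construction of the Hadamard matrix and the
natural (Hadamard) coefficient ordering, not the sequency ordering
mentioned above; the report does not say which construction or ordering
its software used.

```python
def hadamard(n):
    """Sylvester construction; n must be a power of two. Entries are +1 or -1."""
    h = [[1]]
    while len(h) < n:
        h = [row + row for row in h] + [row + [-e for e in row] for row in h]
    return h

def fwht(vec):
    """Fast Walsh-Hadamard transform (natural order): additions and
    subtractions only, about n*log2(n) of them for a length-n vector."""
    a = list(vec)
    h = 1
    while h < len(a):
        for start in range(0, len(a), 2 * h):
            for i in range(start, start + h):
                a[i], a[i + h] = a[i] + a[i + h], a[i] - a[i + h]
        h *= 2
    return a

def hadamard_2d(x):
    """Z = U^t X V with U = V = H / sqrt(n): transform rows, then columns,
    and apply the combined 1/n normalization once at the end."""
    n = len(x)
    rows = [fwht(r) for r in x]
    cols = [fwht(c) for c in zip(*rows)]
    return [[cols[j][i] / n for j in range(n)] for i in range(n)]

H = hadamard(4)
Z = hadamard_2d([[1.0, 2.0], [3.0, 5.0]])
```

The butterfly in fwht gives the same result as multiplying by the
matrix from hadamard, and the single division by n is the only scaling
needed to keep the two-dimensional transform norm preserving.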

3.6.4  Singular  Value  Decomposition 

Up to this point, the transforms discussed have been linear,
separable and unitary, that is:

$$ Z = U^{t} X V. $$

For  stationary  Gaussian  images  with  separable  covariance  functions, 
theory  indicates  that  this  structure  provides  for  efficient 
decorrelation of X into Z. However, for images which are nonstationary
or  non-Gaussian  or  which  have  nonseparable  covariance  functions,  it  is 
possible  that  a  more  general  transform  than  that  above  could  produce 
better  results. 

One  such  generalization  is  a  nonlinear  transform  that  is  an 
image-adaptive  version  of  those  discussed  above: 

$$ Z = U^{t}(X) \, X \, V(X) $$

where U and V are again unitary. Among transforms of this class, the
best candidate in terms of energy compaction is the singular value
decomposition (SVD):

$$ Z = U^{t} X V $$

where

$$ X X^{t} U = U \Lambda, \quad \text{and} \quad X^{t} X V = V \Lambda, \qquad \Lambda \text{ diagonal.} $$

In this case, Z = Λ^{1/2}, so that Z has at most n non-zero entries, in
contrast to the n^2 entries of the optimal, 2D-cosine, and Hadamard
transforms.
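To make the diagonal structure of Z concrete, here is a minimal sketch
that computes an SVD with a one-sided (Hestenes) Jacobi iteration. This
particular algorithm is our choice for a short, dependency-free example
and is not claimed to be the method used in the study; it also assumes
a square, nonsingular block so that every singular value is non-zero.

```python
import math

def transpose(a):
    return [list(col) for col in zip(*a)]

def matmul(a, b):
    bt = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in bt] for row in a]

def svd_jacobi(x, sweeps=30, tol=1e-12):
    """Return (s, U, V) with x = U diag(s) V^t and s in decreasing order.
    Assumes x is square and nonsingular."""
    n = len(x)
    a = [row[:] for row in x]            # converges to U * diag(s)
    v = [[float(i == j) for j in range(n)] for i in range(n)]
    for _ in range(sweeps):
        rotated = False
        for p in range(n - 1):
            for q in range(p + 1, n):
                alpha = sum(a[k][p] ** 2 for k in range(n))
                beta = sum(a[k][q] ** 2 for k in range(n))
                gamma = sum(a[k][p] * a[k][q] for k in range(n))
                if abs(gamma) <= tol * math.sqrt(alpha * beta):
                    continue             # columns p, q already orthogonal
                rotated = True
                zeta = (beta - alpha) / (2.0 * gamma)
                t = math.copysign(1.0, zeta) / (abs(zeta) + math.sqrt(1.0 + zeta * zeta))
                cs = 1.0 / math.sqrt(1.0 + t * t)
                sn = cs * t
                for m in (a, v):         # rotate columns p, q of both arrays
                    for k in range(n):
                        mp, mq = m[k][p], m[k][q]
                        m[k][p] = cs * mp - sn * mq
                        m[k][q] = sn * mp + cs * mq
        if not rotated:
            break
    sing = [math.sqrt(sum(a[k][j] ** 2 for k in range(n))) for j in range(n)]
    order = sorted(range(n), key=lambda j: -sing[j])
    u = [[a[k][j] / sing[j] for j in order] for k in range(n)]
    v = [[v[k][j] for j in order] for k in range(n)]
    return [sing[j] for j in order], u, v

X = [[4.0, 0.0, 1.0], [2.0, 3.0, 0.0], [0.0, 1.0, 5.0]]   # illustrative block
s, U, V = svd_jacobi(X)
Z = matmul(matmul(transpose(U), X), V)   # diagonal, with Z_ii = s_i
```

The resulting Z has the singular values on its diagonal and zeros (to
rounding error) elsewhere, and the sum of the squared singular values
equals the energy of X.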


The major property of this transform relevant to image compression
is that this choice of U and V yields a Z (= Λ^{1/2}) with maximum energy
compaction. However, unlike the previous transforms, this U and V
depend upon X, so that it is necessary to transmit not only Z = Λ^{1/2},
but also U and V. Consequently, it is perhaps better to represent this
nonlinear image transformation as:

$$ \mathrm{SVD}(X) = (\Lambda, U, V). $$


Although there are altogether 2n^2 + n non-zero entries in the
arrays Λ, U, and V, a degrees-of-freedom analysis indicates that a
total of n^2 numbers (n for Λ and n^2 - n for U and V together) are
sufficient to completely specify all three arrays.
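One way to arrive at this count, sketched here using the standard fact
that an n x n real orthogonal matrix has n(n-1)/2 free parameters:

$$ \underbrace{n}_{\Lambda} + \underbrace{\frac{n(n-1)}{2}}_{U} + \underbrace{\frac{n(n-1)}{2}}_{V} = n + (n^{2} - n) = n^{2}. $$

For n = 16 this gives 256 numbers, the same as the number of pixels in
the block, against 2n^2 + n = 528 array entries.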

3.6.5  Focus  of  New  Developments 

Primary  attention  in  this  study  was  aimed  at  further  developing  the 
SVD  approach  to  image  coding.  A  small  amount  of  previous  work  using 
SVD's  for  image  compression  was  reported  in  [11],  but  the  results  are 
preliminary  and  do  not  take  into  account  the  image  statistics,  the 
regularity  of  the  singular  vectors  (columns  of  U  and  V),  or  the 
potential  efficiencies  that  can  be  obtained  by  jointly  considering  the 
transform  and  memoryless  coding  processes.  It  was  these  aspects  of  SVD 
coding  which  were  examined  in  the  course  of  this  study  in  developing  an 
optimal SVD image coder.

In addition, recognition of the high overhead necessitated by
singular vector coding prompted examination of a different approach
which reduced this overhead by amortizing it over a number of image
blocks. The mechanism for accomplishing this was the implementation of
a coding scheme in which the U and V operators are specific not to a
single block of image data, but to a collection of such image blocks.
Since this scheme amounts to a class-adaptive Karhunen-Loeve coder, such
an algorithm was also developed for comparison.

4.0 CLASS-ADAPTIVE KARHUNEN-LOEVE TRANSFORM CODER


This  section  describes  the  class-adaptive  Karhunen-Loeve 
transformation  and  the  associated  pre-processing  and  coefficient  coding 
schemes employed with it under the study. Subsection 4.1 covers the
transformation, Subsection 4.2 the preprocessing, and Subsection 4.3 the
coefficient coder.


4.1 Class-Adaptive Karhunen-Loeve Transformation


The Karhunen-Loeve Transformation (KLT) is the method of expansion
by principal statistical components. That is, it involves the
representation of an image block X as a weighted sum of basis blocks
B_ij which reflect statistically significant block characteristics.
This representation takes the form

$$ X = \sum_{i,j} z_{ij} B_{ij} $$

where the z_ij constitute the KLT coefficient array Z. (This
expression represents the inverse KLT operation.) The KLT coefficient
array Z possesses two important properties:


• The elements of Z are uncorrelated, and

•  The  average  energy  compaction  into  the  first  few  elements  of 
Z  is  greater  than  that  obtained  from  any  other  linear 
transformation. 


4.1.1  The  Separable  Covariance  Assumption 

In order that the KLT be implementable as a separable operation on X,

$$ Z = U^{t} X V, $$

the basis blocks B_ij above must take the form of the outer product of
two vectors, specifically:

$$ B_{ij} = u_i v_j^{t}. $$

This situation obtains if it is assumed that the image covariance
function is separable, i.e., if the correlation of the two pixels x(i,j)
and x(i+Δi, j+Δj) depends not on the Euclidean separation
sqrt(Δi^2 + Δj^2), but separately on the vertical separation Δi and the
horizontal separation Δj. Mathematically, this can be written as

$$ \mathrm{COV}(\Delta i, \Delta j) = C_V(\Delta i) \, C_H(\Delta j) $$

where C_V is the vertical image covariance and C_H is the horizontal
covariance.

The  significance  of  such  an  assumption  is  illustrated  graphically 
in Figure 4-1, which shows a typical radially-symmetric image covariance
function  in  part  (a)  and  a  separable  approximation  to  it  in  part  (b). 

The  effect  of  the  approximation  is  to  over-accentuate  image  correlation 
vertically  and  horizontally  and  under-accentuate  it  at  oblique  angles. 
Thus,  vertical  and  horizontal  image  structure  can  be  expected  to  be 
retained  somewhat  more  faithfully  than  oblique  image  structure  when  KLT 
coefficient  coding  is  performed.  However,  the  cost  of  implementing  a 
KLT  based  on  a  non-separable  image  model  is  prohibitive  (an  order  of 
magnitude  more  calculation).  Consequently,  we  adopted  the  separable 
model for derivation of the KLT operators.
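The directional effect of the separable approximation can be seen in a
small sketch. It assumes an exponential (first-order Markov) covariance
with correlation coefficient 0.9 for both models; this model and the
parameter values are our illustrative assumptions and are not taken
from the report or from Figure 4-1.

```python
import math

def separable_cov(di, dj, rho_v=0.9, rho_h=0.9):
    """Separable model: C_V(di) * C_H(dj), with assumed exponential factors."""
    return (rho_v ** abs(di)) * (rho_h ** abs(dj))

def isotropic_cov(di, dj, rho=0.9):
    """Radially symmetric model: depends only on the Euclidean separation."""
    return rho ** math.hypot(di, dj)

along_axis = (separable_cov(3, 0), isotropic_cov(3, 0))   # identical values
oblique = (separable_cov(2, 2), isotropic_cov(2, 2))      # separable is lower
```

With the two models matched along the axes, the separable form assigns
a noticeably lower correlation to oblique displacements, which is
consistent with the expectation that oblique structure is retained
somewhat less faithfully.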

4.1.2  KLT  Definition 


Based  on  the  separable  covariance  assumption,  it  is  shown  in 
Appendix  A  that  a  separable,  unitary  KLT  transformation  takes  the 
following form:

$$ Z = U^{t} X V $$

where the U and V operators are unitary matrices that depend upon
vertical and horizontal pixel correlations, respectively. Specifically,
the U and V matrices are pre-computed from pixel statistics according
to the following pair of eigenvalue/eigenvector problems:

$$ \left[ \frac{1}{\sigma^{2} n} C^{row} \right] U = U \Lambda^{row}, \quad \text{and} $$

$$ \left[ \frac{1}{\sigma^{2} n} C^{col} \right] V = V \Lambda^{col} $$

where C^row and C^col are row and column covariance matrices, σ^2 is
pixel variance, and Λ^row and Λ^col are diagonal matrices. Since the
resulting U and V are unitary, U^{-1} = U^t and V^{-1} = V^t, so that
these problems can be rewritten as

$$ U^{t} \left[ \frac{1}{\sigma^{2} n} C^{row} \right] U = \Lambda^{row}, \quad \text{and} $$

$$ V^{t} \left[ \frac{1}{\sigma^{2} n} C^{col} \right] V = \Lambda^{col} $$

which shows that the effect of the operators U and V is to diagonalize
the row and column covariance matrices. The result is that, in the KLT,
U and V remove row and column correlations, respectively, from the pixel
array X, producing an uncorrelated coefficient array Z. Maximum energy
compaction into the z_ij with the smallest indices is achieved simply
by ordering the columns of U and V so that the diagonal elements of
Λ^row and Λ^col monotonically decrease from upper left to lower right.

4.1.3 Class Adaptivity


The  success  of  the  KLT  depends  upon  having  a  good  match  between  an 
image  block  X  and  its  assumed  statistics,  summarized  by  the  row  and 
column  covariance  matrices.  Since  images  are  typically  highly 
non-stationary,  a  multiple  class  KLT  scheme  was  adopted  here  to  better 
aid  in  employing  the  proper  statistical  assumptions  at  the  proper  time. 

In  this  approach,  a  number  of  different  pairs  of  row  and  column 
covariance  matrices  are  included,  each  describing  the  statistical 
characteristics  of  a  particular  class  of  imagery.  Then,  whenever  an 
image  block  of  that  class  is  to  be  transformed,  the  U  and  V  matrices 
previously  calculated  from  that  class's  statistics  are  employed  in 
extracting  the  KLT  coefficients. 


Specifically,  if  a  block  X  is  determined  to  belong  to  class  k,  then 
the  class  k  KLT  is  applied  to  X: 


$$ Z = U_k^{t} X V_k, $$

where U_k, V_k satisfy the following class k eigenvalue/eigenvector
problems:

$$ \left[ \frac{1}{\sigma_k^{2} n} C_k^{row} \right] U_k = U_k \Lambda_k^{row}, \quad \text{and} $$

$$ \left[ \frac{1}{\sigma_k^{2} n} C_k^{col} \right] V_k = V_k \Lambda_k^{col}. $$

Because the inverse KLT is class-dependent,

$$ X = U_k Z V_k^{t}, $$

it is necessary to encode not only the array Z, but also the class label
k, so that the decoder can know how to properly inverse transform the
coefficients Z it receives from the channel. For this reason, the
number of classes is held to a reasonably small number, permitting
efficient encoding of class information and resulting in near negligible
overhead associated with block class encoding. In this study, eight
classes were employed.
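As a rough illustration of how small this overhead is, assume (the
report does not specify the label code) that the class label is sent as
a fixed-length binary code for each 16 x 16 block:

$$ \log_2 8 = 3 \ \text{bits per block}, \qquad \frac{3}{16 \times 16} \approx 0.0117 \ \text{bits per pixel}. $$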


A  key  issue  associated  with  class-adaptive  coding  is  the  mechanism 
for  determining  a  block's  class.  Only  if  blocks  can  be  easily  and 
consistently  separated  into  meaningfully  distinct  classes  is  the  scheme 
useful.  Properly,  the  problem  of  identifying  meaningful  classes  and 
determining  reasonable  classification  schemes  is  a  problem  in 
unsupervised  pattern  recognition. 

Ideally,  a  number  of  block  features  would  be  examined  to  find  the 
optimal  class  boundaries,  and  the  feature  extraction  procedure  and 
classification  logic  would  be  analyzed  to  determine  the  best  tradeoff 
between  accuracy  of  correct  classification  and  computational  expense. 
Instead,  we  adopted  a  block  classifier  based  on  the  extraction  of  a 
single  scalar  feature  known  to  be  strongly  correlated  with  the  quantity 
of  information  contained  in  a  block.  We  thus  select  our  classes  to 
roughly  correspond  to  varying  levels  of  block  information  content  and, 
thus,  difficulty  of  compression. 

The feature employed in this study was block a.c. energy, defined
as the mean square deviation of a block's pixel values from the average
intensity value. That is:

$$ \mu(X) = \frac{1}{n^{2}} \sum_{i,j=1}^{n} (x_{ij} - \bar{x})^{2}, \qquad \bar{x} = \frac{1}{n^{2}} \sum_{i,j=1}^{n} x_{ij}. $$

The feature μ(X) is a good measure of block "busyness" and for this
reason provides a high correlation with block information content. In
addition, since it is based upon energy, and both U and V are unitary, μ
can be calculated in either the pixel or transform coefficient domain.

Based  upon  this  feature,  a  simple  classifier  of  the  following  form 
was  employed: 


$$ \text{Block } X \text{ is in class } k \text{ if } t_{k-1} \le \mu(X) < t_k. $$

The decision points t_k were initially left unspecified, and an
experiment was conducted to determine the best choice. During this
experiment, which is detailed in Section 7 of this report, a uniform
spacing of the t_k's in log(μ) space was indicated as best, and was
adopted for all class-adaptive applications.
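A minimal sketch of such a classifier follows. The end points of the
threshold range (lo and hi) are hypothetical values chosen for the
example; the report determines the actual decision points by the
experiment in Section 7. Blocks below the first interior threshold fall
in class 1 and blocks at or above the last fall in class K.

```python
import bisect
import math

def ac_energy(x):
    """Mean square deviation of the pixel values from the block average."""
    count = sum(len(row) for row in x)
    mean = sum(sum(row) for row in x) / count
    return sum((e - mean) ** 2 for row in x for e in row) / count

def log_uniform_thresholds(lo, hi, num_classes):
    """Interior decision points t_1 .. t_{K-1}, uniformly spaced in log(mu)
    between the assumed end points lo and hi."""
    step = (math.log(hi) - math.log(lo)) / num_classes
    return [math.exp(math.log(lo) + k * step) for k in range(1, num_classes)]

def classify(mu, thresholds):
    """Class k (1-based) such that t_{k-1} <= mu < t_k."""
    return bisect.bisect_right(thresholds, mu) + 1

# Hypothetical range: thresholds near 2, 4, 8, ..., 128 for eight classes.
thresholds = log_uniform_thresholds(1.0, 256.0, 8)

flat_block = [[7.0, 7.0], [7.0, 7.0]]
busy_block = [[0.0, 10.0], [10.0, 0.0]]
flat_class = classify(ac_energy(flat_block), thresholds)   # lowest class
busy_class = classify(ac_energy(busy_block), thresholds)   # a.c. energy of 25
```

Because the feature is a single scalar, classification costs one pass
over the block plus a binary search over the decision points.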

4.1.4 KLT Computational Algorithms

Three types of calculations are associated with the KLT:

• Determining transformation operators,

• Extracting KLT coefficients, and

• Reconstructing pixels from KLT coefficients.

The first type, involving construction of U_k and V_k for each
class, amounts to the solution of 2K eigenvalue/eigenvector problems,
where K represents the number of classes (8 in this study). Each of
these problems entails the diagonalization of an n x n real, symmetric,
positive semidefinite matrix. Since n = 16 in this study, such problems
can easily be solved by use of a conventional matrix calculation package
such as LINPACK [12]. Since this calculation is off-line and precedes
actual image coding, high efficiency is not required.
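For readers without such a package at hand, the diagonalization step
can be sketched with a cyclic Jacobi iteration. This is a stand-in
chosen for brevity, not the routine used in the study, and the 4 x 4
covariance matrix with entries 0.9^|i-j| is an assumed example. Any
positive scale factor applied to the covariance matrix changes the
eigenvalues but not U.

```python
import math

def jacobi_eigh(c, sweeps=50, tol=1e-12):
    """Diagonalize a real symmetric matrix by cyclic Jacobi rotations.
    Returns (eigenvalues, U) with eigenvalues in decreasing order and
    U^t C U diagonal, so the columns of U are the ordered eigenvectors."""
    n = len(c)
    a = [row[:] for row in c]
    u = [[float(i == j) for j in range(n)] for i in range(n)]
    for _ in range(sweeps):
        off = sum(a[i][j] ** 2 for i in range(n) for j in range(n) if i != j)
        if off < tol:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if a[p][q] == 0.0:
                    continue
                tau = (a[q][q] - a[p][p]) / (2.0 * a[p][q])
                t = math.copysign(1.0, tau) / (abs(tau) + math.sqrt(1.0 + tau * tau))
                cs = 1.0 / math.sqrt(1.0 + t * t)
                sn = cs * t
                for k in range(n):       # columns p, q: A <- A J and U <- U J
                    for m in (a, u):
                        mp, mq = m[k][p], m[k][q]
                        m[k][p] = cs * mp - sn * mq
                        m[k][q] = sn * mp + cs * mq
                for k in range(n):       # rows p, q: A <- J^t A
                    ap, aq = a[p][k], a[q][k]
                    a[p][k] = cs * ap - sn * aq
                    a[q][k] = sn * ap + cs * aq
    order = sorted(range(n), key=lambda j: -a[j][j])
    return [a[j][j] for j in order], [[u[k][j] for j in order] for k in range(n)]

# Assumed example covariance: first-order Markov, correlation 0.9.
C = [[0.9 ** abs(i - j) for j in range(4)] for i in range(4)]
eigenvalues, U = jacobi_eigh(C)
```

Sorting the eigenvalues in decreasing order gives the column ordering
that places the largest average energy in the lowest-index
coefficients.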

For  the  KLT,  both  forward  and  inverse  transformations  are  performed 
by  straightforward  matrix  multiplication: 

$$ Z = U^{t} X V \quad \text{and} \quad X = U Z V^{t}. $$

Thus, 2n^3 multiplications and additions are required to extract KLT
coefficients or to reconstruct pixels from coefficients.

In general, no "fast" KLT algorithm (akin to the FFT) exists,
although under certain assumptions on the column and row covariance
matrices, U's and V's corresponding to the sine transform can be
generated. In this special case, the FFT can be used to effect both the
forward  and  inverse  transformations.  However,  the  required  assumptions 
to  force  this  situation  violate  the  philosophy  of  fitting  the 
transformation  to  the  naturally  arising  class  statistics,  which  is  the 
whole  reason  for  including  the  KLT  in  this  study.  The  cosine  transform, 
which  is  very  similar  to,  and,  in  fact,  has  been  shown  to  be  superior  to 
the  sine  transform  in  a  number  of  cases,  is  already  included  in  the 
study  for  comparison,  so  including  both  would  not  illuminate  any  new 
performance  possibilities. 

4.2  KLT  Preprocessing 

KLT preprocessing entails the calculation of class-specific KLT
operator matrices U_k and V_k and coefficient statistic matrices M_k
and Σ_k from training data. The process is illustrated in Figure 4-2
and consists of three parts:

•  Classify  blocks  of  training  imagery, 

• Compute transform operator matrices and predicted statistics,
and

• Collect empirical statistics.

4.2.1  Classify  Blocks 

The  block  classification  process  involves  the  two  steps  discussed 
in  subsection  4.1,  namely: 

• Compute activity measure μ(X), and

• Compare μ(X) with the decision points t_k to assign the block's class.

[Figure 4-2. KLT Preprocessing]

The result is the appending of a label k to each block of training
imagery. Since the KLT is class-adaptive, all further pre-processing to
be discussed is class-specific, in the sense that all calculations are
performed separately for all class 1 (k=1) blocks, all class 2 (k=2)
blocks, etc.

4.2.2  Compute  KLT  Operator  Matrices  and  Predicted  Statistics 

The class-specific U_k and V_k matrices are calculated from
class-specific block statistics. Specifically, the following three
block statistic matrices are computed for each class k:

• X̄_k = AVG [X in class k]

• R_k^row = AVG [X X^t in class k]

• R_k^col = AVG [X^t X in class k]

If  these  statistics  are  accumulated  over  a  large  number  of  blocks 
from  a  variety  of  imagery,  they  can  be  expected  to  converge  to  their 
proper values. However, whenever the training set is finite, residual
structural artifacts may remain in the calculated statistics. To help
smooth out these artifacts, the sample space of training imagery can be
artificially expanded by the addition of new members synthesized from
original  members. 

In particular, suppose $\mathcal{X}$ denotes the sample space of image
blocks X obtained by partitioning the training imagery into n x n
blocks. The set can be expanded by any of the following schemes:

• Re-partition each image n^2 times, so that block boundaries
shift around the image, causing a given pixel to occupy the
various n^2 locations of a block exactly once. This
eliminates artifacts due to block location within an image,
and expands $\mathcal{X}$ by a factor of n^2.

• Flip each block vertically, horizontally and both. This
eliminates certain artifacts due to the imaging system's
orientation with respect to the scene, and expands $\mathcal{X}$ by a
factor of four.

• Rotate each block 90°, then apply flips (good only for square
blocks). This eliminates other artifacts due to the imaging
system's orientation with respect to the scene, and
expands $\mathcal{X}$ by a factor of four.

In  this  study,  the  second  and  third  of  these  sample  space 
enhancement  schemes  were  employed  for  block  statistics  calculation.  The 
first was omitted due to the extremely high computation and storage load
associated  with  implementing  it,  and  because  the  training  set  was 
reasonably  large  to  begin  with. 
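The flip and rotation schemes can be sketched as follows. We assume
here that "flip both" means a vertical flip followed by a horizontal
flip and that the rotation is 90 degrees clockwise; the report states
neither convention, and the function names are illustrative.

```python
def flip_vertical(x):
    """Reverse the order of the rows (top becomes bottom)."""
    return [row[:] for row in x[::-1]]

def flip_horizontal(x):
    """Reverse each row (left becomes right)."""
    return [row[::-1] for row in x]

def rotate90(x):
    """Rotate a square block 90 degrees clockwise."""
    return [list(col) for col in zip(*x[::-1])]

def expand_block(x):
    """Return the block, its three flips, and the same four for the rotated
    block: eight training samples derived from one original."""
    variants = []
    for base in (x, rotate90(x)):
        variants.append([row[:] for row in base])
        variants.append(flip_vertical(base))
        variants.append(flip_horizontal(base))
        variants.append(flip_horizontal(flip_vertical(base)))
    return variants

samples = expand_block([[1, 2], [3, 4]])
```

Applying both schemes together yields eight variants per block, all
distinct unless the block itself has some symmetry.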

An additional structural artifact can be removed by introducing the
homogeneous mean assumption. That is, the class k block mean E[X] is
assumed to be a matrix having all values equal to μ_k, i.e.:

$$ E[X] = \begin{bmatrix} \mu_k & \mu_k & \cdots & \mu_k \\ \mu_k & \mu_k & \cdots & \mu_k \\ \vdots & \vdots & & \vdots \\ \mu_k & \mu_k & \cdots & \mu_k \end{bmatrix} = \mu_k [1] $$

where [1] is the n x n matrix all of whose elements are 1's. Since any
deviation from this behavior is without physical justification, the
assumption is introduced as a constraint to be satisfied during the
sample mean calculation. This means that instead of determining X̄_k by
elementwise averaging over the X's in class k, μ_k is calculated by
averaging over all elements of all X's in class k:

$$ \mu_k = \mathrm{AVG} \left[ \frac{1}{n^{2}} \sum_{i,j=1}^{n} x_{ij} \; : \; X \text{ in class } k \right] $$

Note: This can also be written as:

$$ \mu_k = \frac{1}{n^{2}} \sum_{i,j=1}^{n} \bar{x}_{k,ij} $$

where X̄_k = [x̄_{k,ij}], which is the way we actually implemented it.

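Once X̄_k, R_k^row and R_k^col have been obtained, the required
sample covariance matrices are computed from one of the following pairs
of equations:

• Without homogeneous mean constraint

$$ C_k^{row} = R_k^{row} - \bar{X}_k \bar{X}_k^{t} $$

$$ C_k^{col} = R_k^{col} - \bar{X}_k^{t} \bar{X}_k $$

• With homogeneous mean constraint

$$ C_k^{row} = R_k^{row} - \mu_k [1] \bar{X}_k^{t} - \mu_k \bar{X}_k [1] + \mu_k^{2} [1][1] $$

$$ C_k^{col} = R_k^{col} - \mu_k [1] \bar{X}_k - \mu_k \bar{X}_k^{t} [1] + \mu_k^{2} [1][1] $$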
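The row covariance calculation can be sketched for a toy class of two
2 x 2 blocks. The sketch assumes the constrained covariance is the
average of (X - μ_k[1])(X - μ_k[1])^t, expanded in terms of the block
mean matrix and the row correlation matrix R^row = AVG[X X^t]; function
names and data are illustrative, and the column case is analogous with
X^t X in place of X X^t.

```python
def transpose(a):
    return [list(col) for col in zip(*a)]

def matmul(a, b):
    bt = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in bt] for row in a]

def average(mats):
    """Elementwise average of a list of equally sized matrices."""
    count, n = len(mats), len(mats[0])
    return [[sum(m[i][j] for m in mats) / count for j in range(n)] for i in range(n)]

def class_statistics(blocks):
    """Block mean matrix, row correlation matrix, and scalar mean mu."""
    n = len(blocks[0])
    x_bar = average(blocks)
    r_row = average([matmul(x, transpose(x)) for x in blocks])
    mu = sum(sum(row) for row in x_bar) / (n * n)
    return x_bar, r_row, mu

def row_cov_unconstrained(x_bar, r_row):
    """R_row minus (mean matrix)(mean matrix)^t."""
    m = matmul(x_bar, transpose(x_bar))
    n = len(x_bar)
    return [[r_row[i][j] - m[i][j] for j in range(n)] for i in range(n)]

def row_cov_homogeneous(x_bar, r_row, mu):
    """R_row - mu [1] Xbar^t - mu Xbar [1] + mu^2 [1][1], using [1][1] = n [1]."""
    n = len(x_bar)
    ones = [[1.0] * n for _ in range(n)]
    a = matmul(ones, transpose(x_bar))
    b = matmul(x_bar, ones)
    return [[r_row[i][j] - mu * a[i][j] - mu * b[i][j] + mu * mu * n
             for j in range(n)] for i in range(n)]

blocks = [[[1.0, 2.0], [3.0, 4.0]], [[2.0, 0.0], [1.0, 5.0]]]
x_bar, r_row, mu = class_statistics(blocks)
c_row = row_cov_homogeneous(x_bar, r_row, mu)
```

The constrained form agrees with averaging (X - μ[1])(X - μ[1])^t
directly over the blocks, but it needs only the accumulated statistics
and not a second pass over the training data.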

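From these matrices, the class-specific KLT operators U_k and V_k
are obtained from (see Appendix A):

$$ \left[ \frac{1}{\sigma_k^{2} n} C_k^{row} \right] U_k = U_k \Lambda_k^{row}, \quad \text{and} $$

$$ \left[ \frac{1}{\sigma_k^{2} n} C_k^{col} \right] V_k = V_k \Lambda_k^{col}. $$

result. When the assumption is not valid, the second, empirical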
approach  produces  a  more  accurate  estimate  of  the  statistics  of  the 
coefficients actually produced by application of U_k and V_k.

Note that we are not talking about recalculating U_k and V_k
under more general covariance assumptions. Rather, we are only dealing
with obtaining a more accurate estimate of the average properties of the
coefficients obtained by use of that U_k and V_k. The degree of
disparity  between  the  statistics  calculated  by  the  two  methods  indicates 
the  degree  to  which  the  actual  training  data  departs  from  the  assumed 
separable  covariance  model. 

Empirical  statistics  are  obtained  by  applying  the  appropriate 
class-specific  KLT  to  each  block  of  training  imagery,  and  accumulating 
statistics on the resulting coefficients. In particular, two items are
required:

• Mean M_k, and

• Standard Deviation Σ_k

for each class k.

Calculation  proceeds  in  two  steps.  First,  the  mean  and  mean  square 
coefficient  values  are  accumulated,  then  the  standard  deviations  are 
derived  from  this  data.  The  first  step  entails  the  following  averages: 

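    M_k = AVG( U_k X V_k : X is class k ), and

the mean square array [r_{k,ij}],

    where r_{k,ij} = AVG( z_{ij}^2 : Z = U_k X V_k and X is class k ).

Next, the standard deviation array \Sigma_k is obtained by:

    \Sigma_k = [\sigma_{k,ij}],

    where \sigma_{k,ij}^2 = r_{k,ij} - m_{k,ij}^2

and m_{k,ij} is the ij-th element of M_k.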

The various sample space enhancement techniques discussed under 4.1.2 are germane to empirical statistics calculation as well. However, rather than expanding the sample space of X's directly, it is possible for the flip and rotation type of enhancements to expand the sample space of Z's instead. This is of great practical benefit because of the large computational load associated with applying the KLT to so many blocks. Appendix B shows how the statistics of the expanded set can be calculated from the statistics of the original set.


4.2.4  KLT  Preprocessing  Summary 


To  summarize,  KLT  preprocessing  is  a  training  procedure  applied  to 
a  sample  space  of  blocks  obtained  by  appropriately  partitioning  a  set  of 
training imagery. The result is the generation of several class-specific quantities:

• KLT operators U_k and V_k for each class,

• Predicted statistics of Z for each class, and

• Empirically collected statistics of Z for each class.

The U_k and V_k matrices are required to specify the class-adaptive KLT operation, while the coefficient statistics M_k and \Sigma_k are employed to efficiently code the coefficients produced by the KLT operator.

4.3  KLT  Image  Coding 

Figure 4-3 depicts the KLT image coder employed in the study. The process begins by extracting an n x n block X from an image. The block is classified by extracting the a.c. energy activity measure u(X) and comparing the resulting value to a set of decision thresholds {t_k}. The result is a block label k. Based on this k, the proper KLT is applied to X using the appropriate U_k and V_k matrices.

Figure 4-3. Image Coding Chain
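The classification and transform steps can be sketched as below. The definition of the activity measure is not given in this section, so `ac_energy` assumes it is the energy of the block about its mean; the threshold values, the function names, and the identity placeholder operators are likewise hypothetical.

```python
import numpy as np

def ac_energy(X):
    """Assumed activity measure u(X): energy of the block about its mean."""
    return float(np.sum((X - X.mean()) ** 2))

def classify(X, thresholds):
    """Return the class label k by comparing u(X) with ascending thresholds."""
    return int(np.searchsorted(thresholds, ac_energy(X), side="right"))

def klt_forward(X, U, V, thresholds):
    """Classify X, then apply the class-specific KLT Z = U_k X V_k."""
    k = classify(X, thresholds)
    return k, U[k] @ X @ V[k]

n = 4
thresholds = np.array([1.0, 10.0, 100.0])            # hypothetical; yields 4 classes
U = [np.eye(n) for _ in range(4)]                    # placeholder operators
V = [np.eye(n) for _ in range(4)]
flat = np.full((n, n), 7.0)                          # constant block, no a.c. energy
busy = np.arange(n * n, dtype=float).reshape(n, n)   # strongly varying block
k_flat, _ = klt_forward(flat, U, V, thresholds)
k_busy, Z = klt_forward(busy, U, V, thresholds)
```

The label k is returned alongside Z because the decoder needs it to select the matching inverse operators.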

The  resulting  KLT  coefficients  are  then  encoded,  and  placed  into 
the  output  buffer  for  formation  into  a  bit  stream.  Prior  to  this 
encoding,  a  rate  equalization  step  occurs  which  is  aimed  at  achieving  a 
particular  overall  coding  rate  (e.g.,  1  bit  per  pixel).  This  is 
achieved  by  computing  a  global  distortion  parameter  D  which  serves  to 
control  coefficient  coding  by  setting  the  fidelity  level  at  which  the 
coder  is  to  operate.  The  rate  equalization  algorithm  implemented  for 
the  KLT  coder  is  essentially  identical  to  that  implemented  for  the  SVD 
coder,  and  is  discussed  separately  in  Section  6. 

In addition to the KLT operation itself, KLT coefficient coding is also a class-adaptive operation. This permits the allocation of relatively more channel bandwidth (number of bits) to high-information portions of the image than to low-information portions. This is implemented by generating more bits for "busy" (high activity measure u) blocks than for "quiet" (low activity measure u) blocks. The result is that bandwidth is adaptively allocated to the various blocks within an image. Class-adaptive KLT coefficient coding is discussed in subsection 4.3.2.


The  decoding  operation  is  essentially  the  reverse  of  the  encoding 
process.  However,  because  the  KLT  operation  is  class-adaptive,  the 
decoder  must  be  provided  with  each  block's  class  label  in  order  to 
properly  inverse-KLT  the  reconstructed  coefficients  into  the 
reconstructed  pixel  block.  Thus,  the  block  labels  k  constitute  overhead 
information  which  must  be  encoded  and  entered  into  the  channel. 

Similarly,  the  coder  control  parameter  D  must  be  available  at  the 
decoder  in  order  for  coefficient  reconstruction  to  be  properly 
performed.  Thus,  this  parameter  also  constitutes  overhead  to  be  encoded 
and entered into the channel. Overhead coding is discussed next, in subsection 4.3.1.

4.3.1 Overhead Encoding

Since  only  a  single  D  value  need  be  specified  for  each  image,  and 
since  only  one  of  a  small  number  of  possible  class  labels  need  be 
specified  for  each  block,  overhead  coding  does  not  consume  much  channel 
bandwidth. Specifically, since, as indicated in Section 6, it is log(D) which controls coder fidelity, an 8-bit BCD log-quantizer was adopted for D. For k, which could assume one of eight values, a simple 3-bit
BCD  quantizer  was  employed. 

The bandwidth consumed by encoding this overhead is slight. In particular, for 16 x 16 blocks and 256 x 256 images, the overhead is:

•  Distortion  parameter:  0.0001  bpp 

•  Class  labels:  0.012  bpp. 

Thus,  total  overhead  to  achieve  both  class  adaptivity  and  rate 
equalization  is  slightly  more  than  one  hundredth  of  a  bit  per  pixel. 
Since,  for  the  high  interest  imagery  under  study  here,  overall  coded 
rates  on  the  order  of  one  bit  per  pixel  are  of  interest,  the  overhead 
associated  with  this  scheme  is,  in  fact,  negligible. 
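The overhead figures quoted above follow directly from the quantizer sizes and the block and image dimensions. The short calculation below reproduces them; the variable names are illustrative only.

```python
block = 16        # block side in pixels
image = 256       # image side in pixels
bits_D = 8        # one 8-bit log-quantized D per image
bits_label = 3    # one 3-bit class label per block

bpp_D = bits_D / (image * image)            # distortion parameter overhead
bpp_labels = bits_label / (block * block)   # class label overhead
total = bpp_D + bpp_labels

print(f"{bpp_D:.4f} {bpp_labels:.3f} {total:.4f}")   # prints "0.0001 0.012 0.0118"
```

The class labels dominate, since one label is sent per block while D is sent once per image.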

4.3.2  KLT  Coefficient  Coding 

The  key  aspect  of  the  KLT  coefficient  coder  is  that  it  is 
class-adaptive.  This  adaptivity  extends  into  two  domains: 

•  Interblock  adaptivity,  and 

•  Intrablock  adaptivity. 

Interblock  adaptivity  refers  to  the  distribution  of  total  bandwidth 
among the various blocks in an image according to block class. High activity-index blocks contain more information and are thus allocated more bandwidth than are low activity-index blocks, which contain less information.


Intrablock  adaptivity  refers  to  the  distribution  of  bandwidth  among 
the  various  coefficients  in  a  particular  coefficient  array  Z  according 
to  their  statistics.  Coefficients  with  a  high  degree  of  predictability 
(e.g.,  usually  small)  are  allocated  less  bandwidth  than  are  coefficients 
with  a  low  degree  of  predictability  (i.e.,  can  occur  over  a  wide  range 
of  values). 

Figure 4-4 illustrates an example of bit assignment arrays for two classes, one for low-activity blocks and the other for high-activity blocks. The arrays are to be interpreted as assigning the number of bits to be used in encoding the various 16^2 = 256 coefficients within the array Z. Thus, the 3 in the (i,j) = (3,2) position of the first array indicates that, for low-activity blocks, z_32 is to be coded with a 3-bit quantizer.

The figure illustrates both types of adaptivity. Interblock adaptivity is indicated by the difference in the total number of bits allocated to all the n^2 coefficients, i.e., by the difference in the summations over all elements of each array. Intrablock adaptivity is illustrated by the preferential allocation of bits to those coefficients in the upper left hand corner of the arrays, corresponding to the coefficients which typically require the most dynamic range. Note that in both arrays a number of coefficients are allocated no bits at all, indicating they are to be ignored (not coded). These are the typically insignificant coefficients (approximated, for example, by zero).

Coefficient  coding  requires  resolution  of  two  issues: 

•  How  to  make  coder  assignments,  and 

• What quantizer to employ.

4.3.2.1 Coder Assignment


Coder  assignment  amounts  to  constructing  the  bit  assignment 
matrices  shown  in  Figure  4-4.  Since  we  are  dealing  with  an  eight-class 
situation,  eight  such  matrices  are  required. 

The criterion employed for coder assignment is the minimization of mean squared coding error at any given coding rate. Such a criterion results in a bit allocation rule which distributes the mean squared error uniformly across all blocks, and, within a block, uniformly across all coefficients. To achieve such uniformity, such a rule must allocate more bits to high-activity blocks than to low-activity blocks, and, within a block, more bits to strongly varying coefficients than to quiescent coefficients.

As shown in Appendix C, this criterion results in the following assignment rule:

    B_{ij}(k) = INT [ . ]

where:

• B_{ij}(k) = Number of bits allocated to the ij-th coefficient in class k blocks.

• \sigma_{ij}(k) = Standard deviation of the ij-th coefficient in class k blocks. (This is the ij-th element of the class-k coefficient standard deviation matrix \Sigma_k. Either predicted or empirical values can be used.)

• D = Global distortion control parameter (determined to provide rate equalization).

• INT [.] = The integer part (required because we are using fixed rate quantizers).

To  repeat,  this  rule  produces  adaptive  bit  allocation  because  it  results 
in  more  bits  being  assigned  to  coefficients  with  high  variability  as 
measured by large \sigma_{ij}. Because high-activity blocks typically have
many  such  coefficients,  such  blocks  receive  more  bits  in  aggregate  than 
do  low-activity  blocks. 
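The behavior of such a rule can be illustrated with a short sketch. The exact expression derived in Appendix C is not reproduced here, so the form used below, B = INT[0.5 log2(sigma^2 / D)] with zero bits whenever sigma^2 does not exceed D, is an assumption borrowed from the general transform coding literature. The 8-bit ceiling reflects the largest quantizer mentioned in the study, and the standard deviation arrays are made-up values.

```python
import math

def bits_for(sigma, D, max_bits=8):
    """Assumed rule: B = INT[0.5 * log2(sigma^2 / D)], limited to 0..max_bits."""
    if sigma * sigma <= D:
        return 0                                  # coefficient is not coded
    return min(max_bits, int(0.5 * math.log2(sigma * sigma / D)))

def bit_map(Sigma_k, D):
    """Build a bit assignment array from a standard deviation array."""
    return [[bits_for(s, D) for s in row] for row in Sigma_k]

quiet = [[8.0, 2.0], [2.0, 0.5]]      # hypothetical low-activity class
busy = [[64.0, 16.0], [16.0, 4.0]]    # hypothetical high-activity class
D = 1.0

bits_quiet = bit_map(quiet, D)        # [[3, 1], [1, 0]]
bits_busy = bit_map(busy, D)          # [[6, 4], [4, 2]]
```

Raising D lowers every entry at once, which is how a single global parameter can steer the overall rate.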

4.3.2.2 Coefficient Quantization and Coding

Once  a  coefficient  is  allocated  a  number  of  bits  for  encoding,  the 
next  question  is  how  to  employ  these  bits  in  effectively  encoding  the 
coefficient.  A  number  of  possibilities  exist,  but  the  Max  quantizer  was 
selected  here  for  its  optimality  properties.  The  key  requirement  for 
applying  the  Max  quantizer  is  that  the  probability  density  functions  of 
the z_ij be known.

The assumption applied is that all coefficients z_ij share the same form of probability density function, parameterized by mean m_ij(k) and variance \sigma_ij^2(k). Thus, the derived coefficients (z_ij - m_ij(k)) / \sigma_ij(k) all share the same zero-mean, unit-variance PDF p(z).

For this study, we used a modified Gaussian function for p(z). The Gaussian assumption is justifiable by the central limit theorem, and the modification, which slightly raised the tails of the distribution, was added to account for rare but important events.

The  Max  quantizer  consists  of  a  set  of  quantizer  decision 
thresholds  and  an  associated  set  of  reconstruction  levels  selected  so 
that  coding  error  is  minimized  on  average.  It  results  in  a  non-uniform 
quantization  scheme  that  is  tailored  to  the  statistics  of  the 
coefficients.  For  example,  the  three-bit  Max  quantizer  which  minimizes 
mean  squared  coding  error  for  a  Gaussian  PDF  is  shown  in  Figure  4-5. 

Max quantizers of 1, 2, ..., 8 bits were used in the study.
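A Max quantizer of this kind can be designed by the usual alternating iteration: thresholds are placed midway between adjacent reconstruction levels, and each level is moved to the conditional mean of its interval. The sketch below does this for a pure zero-mean, unit-variance Gaussian. It does not include the tail modification used in the study, and the function names and iteration count are assumptions.

```python
import math

def phi(x):
    """Standard normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def Phi(x):
    """Standard normal cumulative distribution."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def max_quantizer(bits, iterations=2000):
    """Return (thresholds, levels) of a Max quantizer for a unit Gaussian."""
    n = 2 ** bits
    levels = [-2.0 + 4.0 * (i + 0.5) / n for i in range(n)]   # initial guess
    for _ in range(iterations):
        # Thresholds lie midway between adjacent reconstruction levels.
        thresholds = [0.5 * (levels[i] + levels[i + 1]) for i in range(n - 1)]
        edges = [-math.inf] + thresholds + [math.inf]
        # Each level becomes the conditional mean of its interval.
        levels = [(phi(edges[i]) - phi(edges[i + 1])) /
                  (Phi(edges[i + 1]) - Phi(edges[i])) for i in range(n)]
    thresholds = [0.5 * (levels[i] + levels[i + 1]) for i in range(n - 1)]
    return thresholds, levels

def quantize(z, thresholds, levels):
    """Map a normalized coefficient z to its reconstruction level."""
    index = sum(1 for t in thresholds if z > t)
    return levels[index]

thresholds3, levels3 = max_quantizer(3)   # three-bit (eight-level) quantizer
```

In use, a coefficient would first be normalized as (z_ij - m_ij(k)) / \sigma_ij(k), quantized with the table for its allocated number of bits, and rescaled at the decoder.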


Although a Max quantizer can be used on all coefficients, special treatment was provided the z_11 coefficient. This is because the basis block corresponding to this coefficient is invariably nearly constant in intensity over all pixels in the block, so that z_11 is similar to the cosine or Hadamard "dc" coefficient.

The  reason  for  special  treatment  for  "dc"  is  that  when  using  a  Max 
quantizer  —  even  an  8-bit  Max  quantizer  —  occasionally  severe  coding 
errors  are  committed.  These  errors  arise  where  the  coefficient  deviates 
most  from  its  assumed  mean  value,  since  there  the  Max  quantizer 
bin-width  is  largest,  and  the  potential  difference  between  the  actual 
coefficient  value  and  its  quantized  (reconstructed)  value  is  greatest. 

Because "dc" errors are perceived as "blockiness" in the image, these errors are potentially more perceptually damaging than are similar errors encountered for a.c. coefficients. Thus, in place of a Max quantizer, a uniform quantizer was applied to the d.c. coefficient.

5.0 SINGULAR VALUE DECOMPOSITION TRANSFORM CODER


The singular value decomposition transform coder uses a transform constituting the method of principal deterministic components. In this method, each image block is again decomposed into a sum of unit norm basis blocks, but here the decomposition achieves the optimal energy compaction for each and every block, rather than merely on average as is the case for the KLT. This means that, here, the fewest coefficients of any decomposition are required for efficient image coding. However, in contrast with the statistical approach where the transform matrices are pre-computed and thus available to both coder and decoder, here the transform matrices themselves depend upon the image block and hence must themselves be coded along with the coefficients.

The transformation used in this approach is the singular value decomposition (SVD), given by:

    S = U^t X V

where X X^t U = U \Lambda, where \Lambda is a diagonal array of non-negative elements and U is orthogonal;

and X^t X V = V \Lambda, where \Lambda is the same diagonal array of non-negative elements and V is orthogonal;

and where S = \Lambda^{1/2}.

The matrix S is diagonal and contains the singular values. The matrices U and V have as their columns the left and right singular vectors of X, respectively. Because U and V depend upon X, all three matrices (S, U, and V) must be coded.
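These defining relations can be checked numerically. The sketch below uses NumPy's `numpy.linalg.svd` on a random stand-in block; this is only a verification aid under that assumption, not the algorithm used in the study.

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.uniform(0.0, 255.0, size=(16, 16))   # stand-in for a 16 x 16 pixel block

U, s, Vt = np.linalg.svd(X)                  # X = U diag(s) V^t
V = Vt.T
S = U.T @ X @ V                              # forward transform S = U^t X V
Lam = np.diag(s ** 2)                        # Lambda, so that S = Lambda^(1/2)

is_diagonal = np.allclose(S, np.diag(s), atol=1e-6)
left_ok = np.allclose(X @ X.T @ U, U @ Lam)    # X X^t U = U Lambda
right_ok = np.allclose(X.T @ X @ V, V @ Lam)   # X^t X V = V Lambda
```

The block is described by 16 singular values plus the entries of U and V, which is why the singular vectors dominate the coding problem.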

Several aspects of image coding using the SVD were explored:

• Computational algorithms,

• Further decorrelation and energy compaction, and

• Singular value/vector coding.


The overall coding chain is indicated in Figure 5-1. It is very similar to the KLT approach, with the major exceptions being:

• The SVD automatically adapts to the block X: it is not class-specific, and

• Both singular values and singular vectors are coded.

As in the KLT approach, coefficients (singular values and vectors) are coded class-adaptively. Since the SVD is a unitary transformation, delaying the extraction of the activity measure u(X) until after the forward SVD operation has no effect on the result of the classification process. In fact, the computational load of calculating u(X) is less here due to the high energy compaction produced by the SVD transforms, as reflected in the diagonal structure of Z.

The  remainder  of  this  section  presents  the  details  of  the  SVD  image 
coder.  Subsection  5.1  discusses  computational  algorithms;  subsection 
5.2  summarizes  additional  steps  potentially  yielding  further  energy 
compaction  or  decorrelation  of  singular  values/vectors;  subsection  5.3 
describes  the  preprocessing  required  to  support  SVD  coding;  and 
subsection  5.4  presents  the  new  schemes  developed  for  singular 
value/vector  coding. 

5.1  SVD  Computational  Algorithms 

Several  candidate  algorithms  for  calculating  the  SVD  were 
identified  during  the  study.  One  is  equally  applicable  for  computing 
the KLT matrices during KLT preprocessing, and was, in fact, applied for that purpose. (Since KLT preprocessing is an initial, off-line training procedure, computational efficiency is not an issue there.)

Figure 5-1. SVD Coding Chain


Two classes of SVD algorithms were identified, the direct and indirect methods. In the direct method, a block X is decomposed by directly searching for orthogonal matrices U and V such that

    U^t X V = S,  S diagonal and non-negative.

In the indirect method, the intermediate, symmetric positive semi-definite matrix X X^t or X^t X is first computed and its eigenvalues and eigenvectors calculated as:

    (X X^t) U = U \Lambda,  or

    (X^t X) V = V \Lambda,

in which U and V are orthogonal and \Lambda is diagonal and positive semi-definite. In fact, S = \Lambda^{1/2}, i.e.,

    \Lambda = S^t S = S S^t.

Whichever of U or V is calculated from the eigenvalue/eigenvector problem, the other is obtained directly from:

    V = X^t U \Lambda^{-1/2},  or

    U = X V \Lambda^{-1/2},

in which \Lambda^{-1/2} is a diagonal matrix having elements which are the reciprocals of the corresponding elements of \Lambda^{1/2} when non-zero, and zero otherwise. In this way, only those columns of V or U which correspond to non-zero singular values are obtained (they are the only ones needed).
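The indirect method can be sketched as follows, working from X^t X and recovering U by substitution. The use of `numpy.linalg.eigh`, the function name `svd_indirect`, and the relative tolerance used to decide which singular values count as zero are all assumptions made for illustration.

```python
import numpy as np

def svd_indirect(X, tol=1e-10):
    """SVD of X via the eigen-decomposition of X^t X (indirect method)."""
    lam, V = np.linalg.eigh(X.T @ X)          # (X^t X) V = V Lambda, ascending order
    order = np.argsort(lam)[::-1]             # reorder to largest eigenvalue first
    lam = np.clip(lam[order], 0.0, None)      # remove tiny negative rounding errors
    V = V[:, order]
    s = np.sqrt(lam)                          # S = Lambda^(1/2)
    inv_s = np.zeros_like(s)
    nonzero = s > tol * s.max()
    inv_s[nonzero] = 1.0 / s[nonzero]         # Lambda^(-1/2), zero where s is zero
    U = (X @ V) * inv_s                       # U = X V Lambda^(-1/2), column scaling
    return U, s, V

rng = np.random.default_rng(4)
X = rng.normal(size=(8, 8))                   # stand-in block
U, s, V = svd_indirect(X)
X_rebuilt = U @ np.diag(s) @ V.T
```

Columns of U that correspond to zero singular values are left as zero vectors, matching the remark that only the columns for non-zero singular values are needed.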

Each  type  of  SVD  computation  method,  direct  and  indirect,  can  be 
tailored  separately  to  two  types  of  array  X,  a  block  of  pixels  and  a 
block  of  transform  coefficients,  resulting  in  the  four  algorithms 
examined. (The indirect pixel block method is also used for computing

Figure 5-2. Direct Pixel Block SVD Calculation

Figure 5-3. Indirect Pixel Block SVD Calculation

The transformation U_i is selected to reduce the largest off-diagonal elements remaining at the i-th iteration.

This results in

    U = U_1 U_2 \cdots U_L,

    S = \Lambda^{1/2}, and

    V = X^t U \Lambda^{-1/2}.

Both  of  these  algorithms  are  generic  in  that  they  are  applicable  to 
extracting  the  SVD  of  any  array  X.  They  were  included  as  baseline 
techniques  against  which  to  compare  the  more  tailored  coefficient  block 
SVD  algorithm  and,  for  the  indirect  method,  as  a  means  of  extracting  the 
KLT  matrices. 

Coefficient  Block  SVD  Algorithms 

The coefficient block SVD approach attempts to exploit prior knowledge about typical image blocks. For example, from knowledge that pixels are non-negative and highly correlated arises the fact that one left and one right singular vector must be close to uniform (vectors of all 1's before normalization). Thus, pre-transforming by a \tilde{U} and \tilde{V} which each include such a column should render \tilde{U}^t X \tilde{V} closer to diagonal.

In addition, the known regularity which often occurs in image block singular vectors can be anticipated by including appropriate columns in the pre-transforms \tilde{U} and \tilde{V}. The result is a matrix \tilde{X} = \tilde{U}^t X \tilde{V} which is more nearly diagonal than is X. This can be exploited in more easily completing the diagonalization process. The overall procedure is shown in Figure 5-4.

The pre-transforms employed here include the 2D cosine and Hadamard. Both include a uniform column in \tilde{U} = \tilde{V} and both tend to mimic typical singular vector structure.

Figure 5-4. Finding the SVD via a Coefficient-Block SVD Calculation


Not only do such pre-transforms accelerate the diagonalization of X, but they also dovetail nicely with the singular vector coding approaches developed in the study. These approaches are discussed in detail elsewhere, but they amount to taking 1D decorrelating transforms (e.g., cosine or Hadamard) of the singular vectors of X and encoding the resulting coefficients. If such a singular vector coding approach is employed in conjunction with a pre-transform intended to ease SVD extraction, a particularly convenient synergism occurs. This is because the normally required steps of backing out the pre-transform to find the singular vectors of X, followed by the application of a decorrelating 1D transform to these singular vectors to prepare for coding, can be eliminated. In particular, if the pre-transform is the 2D version of the 1D decorrelating transform (e.g., 2D cosine and 1D cosine), the inversion of the 2D pre-transform and the application of the 1D transform cancel each other out. This is illustrated in Figure 5-5.

Figure 5-5. Collapsing the Inverse 2D Pre-transformation and the Forward 1D Singular Vector Transformation into the Identity


The result is an efficient algorithm for extracting both the singular values and the 1D transform of the right and left singular vectors of X. The process is illustrated in Figure 5-6. This was the procedure utilized during the evaluation phase of the study.


Both direct and indirect coefficient block SVD algorithms are possible and utilize a sequence of orthogonal transformations to diagonalize the appropriate matrix. In the direct case, transformations U_i and V_i are applied until

    U_L^t \cdots U_1^t (\tilde{U}^t X \tilde{V}) V_1 \cdots V_L

is approximately diagonal. In the indirect case, either the U_i's or V_i's are applied to diagonalize

    U_L^t \cdots U_1^t (\tilde{U}^t X X^t \tilde{U}) U_1 \cdots U_L

or

    V_L^t \cdots V_1^t (\tilde{V}^t X^t X \tilde{V}) V_1 \cdots V_L.

The other is obtained by substitution as in the indirect pixel block algorithm.

Figure 5-6. Combined Algorithm: Efficient Extraction of Both Singular Values and 1D Transform Coefficients of Singular Vectors

The  indirect  method  was  adopted  here  due  to  the  availability  of 
existing  code  to  implement  it,  and,  as  discussed  in  Section  7,  because 
SVD  coder  performance  turned  out  not  to  be  good  enough  to  warrant  a 
thorough investigation of the relative merits of the other SVD
computational  approaches. 

5.2  Decorrelation  and  Energy  Compaction  of  SVD  Coefficients 

Prior  work  utilizing  the  SVD  transformation  for  image  compression 
recognized  the  statistical  correlations  that  typically  occur  within 
singular  vectors  [II].  In  that  work,  a  predictive  coding  scheme  based 
on  DPCM  coding  of  singular  vectors  was  employed  to  exploit  the 
correlation. However, it is well known that such an approach only removes some of the correlation. Efficient encoding demands that as much correlation as possible be removed, which suggests applying a 1D transform to the singular vectors to decorrelate them.


The optimum transform for decorrelating a vector is the Karhunen-Loeve transform. However, for maximum adaptability to non-stationarity, a class-adaptive, singular vector-specific 1D KLT is best. In such a scheme, the particular KLT operator applied would depend upon the block's class and the singular vector's index (location within U or V as appropriate). For our case, we have eight classes and sixteen left and sixteen right singular vectors requiring a total of 8*16*16 = 2048 16x16 KLT operator matrices. Even in the case where statistical distinctions between left and right singular vectors are ignored, 8*16 = 128 such matrices are required.

Such  a  storage  load,  coupled  with  the  computational  load  required 
to  perform  the  KLT  extraction  via  matrix  multiplications,  suggests  that 
suboptimal  transformations  possessing  a  "fast"  implementation  be 
investigated.  This  was  also  indicated  in  order  to  efficiently  combine 
the 2D pre-transformation discussed in Section 5.1 with the 1D singular vector transformation. Thus, we investigated two 1D transforms for
singular  vector  decorrelation: 

•  Cosine,  and 

•  Hadamard. 

The  first  was  included  because  of  its  known  success  at  approximating  the 
KLT’s  optimal  decorrelating  performance.  The  second  was  included  due  to 
its  particular  computational  efficiency. 

As an example of the effect of applying such a transform, Figure 5-7 shows some results for the cosine transform case. The first plot shows an example singular vector, in this case, the third left singular vector from a particular image block. The second plot shows the corresponding 1D cosine transform coefficients. Note the correlation from element to element in the singular vector and both the lack of such correlation and the occurrence of energy compaction in the transform as evidenced by the appearance of some coefficients with significantly larger magnitude than others.

Figure 5-7. Example Singular Vector and its Cosine Transform
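The energy compaction effect can be reproduced with a small numerical sketch. The vector below is a synthetic, smoothly varying stand-in rather than a singular vector from the study's imagery, and `dct_matrix` is an illustrative helper that builds the orthonormal 1D cosine transform.

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal 1D cosine transform matrix C: c = C @ v and v = C.T @ c."""
    k = np.arange(n)[:, None]                  # frequency index
    i = np.arange(n)[None, :]                  # sample index
    C = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    C[0, :] /= np.sqrt(2.0)                    # scale the uniform row
    return C

n = 16
C = dct_matrix(n)
t = (np.arange(n) + 0.5) / n
v = np.exp(-3.0 * t)                           # smooth, highly correlated stand-in
v /= np.linalg.norm(v)                         # unit norm, like a singular vector

c = C @ v                                      # 1D cosine transform coefficients
energy = c ** 2
top3_share = np.sort(energy)[::-1][:3].sum()   # energy held by 3 largest terms
```

Because the transform is orthonormal the total energy is unchanged; it is simply concentrated in a few coefficients, which is what makes unequal bit allocation effective.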

That  this  behavior  is  even  more  striking  on  average  is  shown  in 
Figure  5-8.  In  this  figure,  the  first  plot  illustrates  the  mean  and 
standard  deviation  obtained  by  accumulating  over  all  third  singular 
vectors.  The  corresponding  quantities  for  transform  coefficients  are 
shown  in  the  second  plot.  The  marked  peak  in  the  second  plot  confirms 
the  energy  compacting  property  of  the  transform.  When  coding, 
coefficients  corresponding  to  such  peaks  will  be  more  accurately  coded 
than  will  other,  less  important  coefficients. 


Figure 5-9 illustrates the two alternative implementations of the 1D singular vector transformations. The first approach is the straightforward one, in which the SVD is first calculated and the 1D transform of the resulting singular vectors is then obtained. The second is the combined algorithm of Figure 5-6, which permits coordination with computation of the SVD itself. The equivalence is demonstrated by

    S = U^t X V

    X = U S V^t

    \tilde{U}^t X \tilde{V} = \tilde{U}^t U S V^t \tilde{V} = (\tilde{U}^t U) S (\tilde{V}^t V)^t

    S = (\tilde{U}^t U)^t (\tilde{U}^t X \tilde{V}) (\tilde{V}^t V)

which shows that if {S, U, V} constitute the SVD of X, then {S, \tilde{U}^t U, \tilde{V}^t V} constitute the SVD of \tilde{U}^t X \tilde{V}. Thus, \tilde{U}^t U and \tilde{V}^t V can be obtained in either of two ways:

• Find the SVD of X, then take the 1D transform of the columns of U and V, or

• Find the SVD of \tilde{U}^t X \tilde{V} directly.

Figure 5-9. Two Implementations for Extracting Singular Values and Singular Vector Transform Coefficients
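The equivalence of the two routes can be confirmed numerically. The sketch below assumes a normalized Hadamard matrix as the pre-transform (so that the 2D pre-transform and the 1D singular vector transform use the same matrix) and a random stand-in block; the helper name `hadamard` is illustrative. Singular vectors are determined only up to sign, so the comparison uses absolute inner products.

```python
import numpy as np

def hadamard(n):
    """Orthonormal Sylvester-Hadamard matrix (n must be a power of two)."""
    H = np.array([[1.0]])
    while H.shape[0] < n:
        H = np.block([[H, H], [H, -H]])
    return H / np.sqrt(n)

rng = np.random.default_rng(2)
n = 8
X = rng.uniform(0.0, 255.0, size=(n, n))   # stand-in pixel block
T = hadamard(n)                            # pre-transform; first column is uniform

# Route 1: SVD of X, then the 1D transform of the singular vectors.
U, s, Vt = np.linalg.svd(X)
route1_left = T.T @ U
route1_right = T.T @ Vt.T

# Route 2: SVD of the pre-transformed block.
U2, s2, V2t = np.linalg.svd(T.T @ X @ T)

# Compare column by column, ignoring the sign ambiguity.
left_match = np.abs(np.sum(route1_left * U2, axis=0))
right_match = np.abs(np.sum(route1_right * V2t.T, axis=0))
```

Both routes give the same singular values, and each transformed singular vector from route 1 coincides with a singular vector from route 2 up to sign.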

In addition to the use of the 1D singular vector decorrelating transformation, three other techniques were identified as potentially useful for either additionally decorrelating the elements of S, U, and V, or for introducing further energy compaction. These techniques are:

• SVD reordering,

• Singular vector orthogonalization, and

• Repolarization of singular vectors.

Each of these techniques is described in turn.


5.2.1  SVD  Reordering 

The SVD forward transform takes the form

    S = U^t X V,

where S is diagonal. The inverse transform takes the form

    X = U S V^t

which, because S is diagonal, can be rewritten as

    X = \sum_{i=1}^{n} s_i u_i v_i^t

which expresses X as a weighted sum of n (not n^2) basis blocks u_i v_i^t.


There  is  no  inherent  ordering  to  the  terms  in  this  expression.  In 
fact,  permuting  the  ordering  merely  results  in  permuting  the 
corresponding  columns  of  U  and  V,  and  diagonal  elements  of  S. 
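Both statements, the expansion into n rank-one basis blocks and the freedom to permute its terms, can be verified with a few lines of NumPy. The random block and the random permutation are stand-ins chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 6
X = rng.uniform(0.0, 255.0, size=(n, n))   # stand-in pixel block
U, s, Vt = np.linalg.svd(X)

# X as a weighted sum of n basis blocks u_i v_i^t.
terms = [s[i] * np.outer(U[:, i], Vt[i, :]) for i in range(n)]
X_sum = sum(terms)

# Permuting the terms permutes the columns of U and V and the diagonal of S.
perm = rng.permutation(n)
X_perm = U[:, perm] @ np.diag(s[perm]) @ Vt[perm, :]
```

Any ordering reconstructs the same block, so the choice of ordering is free to be made on coding grounds.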

The normal default ordering is usually selected to result in s's with monotonically decreasing size, i.e., monotonically decreasing |s_i|. However, statistical analyses conducted under this study suggest that the singular vectors ordered in this way also typically have their strongest energy concentrated at monotonically increasing frequencies (or sequencies).

Figure  5-10  illustrates  the  migration  of  this  energy  concentration 
to  higher  frequencies  for  three  singular  vectors.  The  first  singular 
vector  has  most  of  its  energy  concentrated  at  lowest  frequencies.  The 
third  singular  vector  has  most  of  its  energy  concentrated  at  somewhat 
higher  frequencies.  The  twelfth  singular  vector  has  most  of  its  energy 
at still higher frequencies.

This observation suggests that if ordering were performed explicitly in terms of where singular vectors have peak energy instead of in terms of which singular values are largest, a very similar ordering would result. However, the ordering may be different often enough that when statistical averages are taken, even sharper peaking of average coefficient energy would result. This would then permit even more efficient encoding of singular vectors, since their energy would be, on average, more predictably concentrated in certain, known coefficients. There would be a concomitant decrease in the peakedness of the singular value statistics, but this would probably be more than compensated for by the increased peakedness of singular vector coefficient statistics.

This  reordering  was  implemented  and  evaluated  against  no 
reordering.  The  results  are  reported  in  Section  7. 

5.2.2 Singular Vector Orthogonalization

This  enhancement  represents  an  attempt  to  exploit  the  known 
orthogonal  structure  of  the  singular  vector  arrays  U  and  V.  It  results 
in  a  structure  somewhat  different  from  that  otherwise  applied  for  coding 
singular  vectors. 

Up to this point, singular vector coefficient coding was handled simultaneously: after the 1D transform was applied, all the
coefficients  were  encoded  at  once.  In  the  current  enhancement,  the 
structure  is  different:  first,  some  coefficients  are  extracted,  then 
they  are  coded,  then  other  coefficients  are  extracted,  and  then  they  are 
coded.  This  process  cycles  until  all  coefficients  are  extracted  and 
coded. 

This enhancement is intended to exploit the known redundancy in the arrays U and V. Here we will focus on the left singular vector coefficient array \tilde{U}^t U. For notational simplicity, we will denote this array simply as U during the remainder of this discussion, although the process is applied to the coefficient array \tilde{U}^t U.

Figure 5-10. Example of Migration of Energy Concentration with Frequency


First we note that the j-th column of U, u_j, is orthogonal to all
previous columns u_1, u_2, ..., u_{j-1}. Suppose a change of basis is
introduced, with the new basis being

    B_j = {u_1, u_2, ..., u_{j-1}, b_{jj}, ..., b_{nj}},

where {b_{jj}, ..., b_{nj}} simply completes the basis in R^n. The vector
u_j can be represented by:

    u_j = \sum_{i=j}^{n} \alpha_{ij} b_{ij},

since u_j is orthogonal to {u_1, u_2, ..., u_{j-1}}, so that it is
linearly independent of {u_1, u_2, ..., u_{j-1}}. Thus, instead of
having to transmit the n elements of u_j directly, only the (n-j+1)
coefficients {\alpha_{jj}, ..., \alpha_{nj}} need be transmitted, as long as
the {b_{ij}} are available to both transmitter and receiver. But the
{b_{ij}} can be computed from the previously transmitted singular vectors
{u_i : i = 1, ..., j-1}, so that the process is realizable.


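When repeated for each j, this process results in an array of \alpha's
which can be collected into the following lower triangular form:

    [ \alpha_{11}                                  ]
    [ \alpha_{21}  \alpha_{22}                     ]
    [     .            .        .                  ]
    [     .            .           .               ]
    [ \alpha_{n1}  \alpha_{n2}    ...  \alpha_{nn} ]

Thus, the n^2 elements of U can be completely represented by the
n(n+1)/2 coefficients [\alpha_{ij}], that is, n-j+1 coefficients for each
column j = 1, ..., n.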


The most obvious coding strategy based on this representation is to
independently code the individual \alpha_{ij}'s using statistics collected
during a pre-processing statistical analysis. However, in this method,
the reconstruction accuracy of later singular vectors is very sensitive
to errors in earlier ones.

This can be seen by examining the reconstructed value of u_j, \hat{u}_j:

    \hat{u}_j = \sum_{i=j}^{n} \hat{\alpha}_{ij} \hat{b}_{ij}.

Here, not only is \hat{\alpha}_{ij} an approximation to \alpha_{ij}, but
\hat{b}_{ij} is also an approximation to b_{ij}. And although the error in
\hat{\alpha}_{ij} is independent of the error in all other \alpha_{pq}'s,
the error in \hat{b}_{ij} depends upon the previous reconstructed values
{\hat{u}_1, ..., \hat{u}_{j-1}}, which themselves depend upon
{\hat{\alpha}_{pq} : q \le p, q < j}.

To avoid this dependency and to thereby reduce average coding
errors, we use a different basis

    \hat{B}_j = {\hat{b}_{1j}, ..., \hat{b}_{nj}},

where {\hat{b}_{1j}, ..., \hat{b}_{j-1,j}} = {\hat{u}_1, \hat{u}_2, ..., \hat{u}_{j-1}}.
The price we pay is that the coefficient matrix of \alpha_{ij}'s is no
longer triangular; it is in general full. However, the elements
occurring in the upper triangle (the \alpha_{ij}'s for i < j) will
typically be small as long as \hat{u}_i is a reasonable approximation to
u_i. They can therefore either be neglected altogether or, as we shall
do, be more coarsely quantized than those \alpha_{ij}'s in the lower
triangle (\alpha_{ij}'s for j \le i).
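
As an illustration of the expansion described in this subsection, the
following Python sketch completes a set of previously transmitted
vectors to an orthonormal basis and computes the expansion coefficients
of the next vector. The function names, and the use of Gram-Schmidt over
the standard unit vectors to complete the basis, are assumptions made
for this example; the report does not specify how the completing
vectors are chosen.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def complete_basis(prev, n, tol=1e-8):
    """Extend the orthonormal vectors in `prev` to an orthonormal basis of R^n.

    Gram-Schmidt is run over the standard unit vectors (an assumed,
    deterministic choice that transmitter and receiver could share).
    """
    basis = [list(p) for p in prev]
    for k in range(n):
        if len(basis) == n:
            break
        cand = [1.0 if i == k else 0.0 for i in range(n)]
        for _ in range(2):                  # second pass for numerical safety
            for b in basis:
                c = dot(cand, b)
                cand = [x - c * y for x, y in zip(cand, b)]
        norm = dot(cand, cand) ** 0.5
        if norm > tol:                      # skip candidates already spanned
            basis.append([x / norm for x in cand])
    return basis


def expansion_coefficients(u_j, prev):
    """Coefficients of u_j on the completed basis, plus the basis itself.

    When u_j is orthogonal to every vector in `prev`, the first
    len(prev) coefficients vanish and need not be transmitted.
    """
    basis = complete_basis(prev, len(u_j))
    return [dot(u_j, b) for b in basis], basis


def reconstruct(coeffs, basis):
    n = len(basis[0])
    return [sum(c * b[i] for c, b in zip(coeffs, basis)) for i in range(n)]
```

With one previous vector in R^3, only the last two coefficients carry
information, which matches the (n-j+1) count given above for j = 2.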

5.2.3  Repolarization 
In the SVD expansion

    X = U S V^t = \sum_{i=1}^{n} s_i u_i v_i^t

there is a fundamental question of polarity of the various members of
each term. In particular, the term s_i u_i v_i^t has a definite sign,
but the individual members s_i, u_i, and v_i do not, so long as
their product works out to have the correct polarity.

The normal default in SVD is to choose s_i \ge 0. That still leaves the
polarities of u_i and v_i unspecified, but constrained so that their
product is of the correct sign. The additional condition we impose is
that the sum of the areas of the two vectors, \sum_j (u_{ij} + v_{ij}), be
non-negative. This condition is basically a default selected to help
minimize the dispersion of singular vector coefficient statistics.

Repolarization is an alternate scheme intended to further reduce
the dispersion of singular vector coefficient statistics. It is
motivated by the goal of providing a consistent sign for the largest
energy component of each singular vector. The scheme consists of
assigning signs to u_i and v_i that result in both having their largest
magnitude transform coefficient be positive. The sign of s_i is then
adjusted to give the term s_i u_i v_i^t the correct polarity.

The price for this repolarization is that singular values are no
longer guaranteed to be non-negative and thus display increased
dispersion in their statistics. However, having both polarization
methods available permits an evaluation of which effect dominates: the
decrease in dispersion of singular vector coefficients, or the increase
in dispersion of singular values.
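
The repolarization rule can be sketched in a few lines of Python. This
is a minimal illustration, assuming each singular vector is held as a
list of its transform coefficients; the function names are hypothetical
and not taken from the study's software.

```python
def repolarize(s, u, v):
    """Return (s', u', v') with the largest-magnitude entry of u' and v' positive.

    s is a singular value; u and v hold the transform coefficients of
    the corresponding left and right singular vectors. The rank-one
    term s * u * v^t is unchanged by the operation.
    """
    def sign_of_peak(vec):
        peak = max(vec, key=abs)            # largest-magnitude coefficient
        return -1.0 if peak < 0 else 1.0

    su, sv = sign_of_peak(u), sign_of_peak(v)
    u_new = [su * x for x in u]
    v_new = [sv * x for x in v]
    s_new = s * su * sv                     # absorb both sign flips into s
    return s_new, u_new, v_new


def outer_term(s, u, v):
    """Rank-one term s * u * v^t as a nested list."""
    return [[s * a * b for b in v] for a in u]
```

Note that s_new is negative whenever exactly one of the two vectors is
flipped, which is the loss of non-negativity discussed above.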

5.3  Preprocessing 

SVD preprocessing is required for the same reason KLT preprocessing
is: as a training step. Since the singular values and the singular
vector coefficients are coded using statistically optimized coding
schemes, the underlying statistics are required.

What are required are:

•  Singular value statistics, and

•  Singular vector statistics.

The procedure for obtaining this information is shown in Figure 5-11.
This process constitutes an empirical SVD statistics calculation. It
begins with the forward SVD transforming of the various blocks in the
training imagery. Since coding is again class-adaptive, separate
class-specific statistics are required. Also, since several different
ordering and polarization methods are to be evaluated, several versions
of the statistics are required. These include:

•  Default singular value/vector ordering and polarization,

•  Singular value/vector re-ordering,

•  Singular value/vector re-polarization, and

•  Both re-ordering and re-polarization.

In each case, the same SVD is applied, and the results simply
reorganized as required. (As previously discussed in 5.1, the order of
the SVD and transform operations can be interchanged.)

The statistics required are the first and second moments of the
various entities to be coded. Specifically, let s_i denote the i-th
singular value, and u_i and v_i the transform coefficient vectors for
the corresponding singular vectors. Then the statistics calculated are:

•  Singular values

       \bar{s}_{k,i}        = AVG[s_i   : X is class k]
       \overline{s^2}_{k,i} = AVG[s_i^2 : X is class k]

•  Singular vector coefficients

       \bar{U}_k = [\bar{u}_{k,ij}] = AVG[U : X is class k]
       \bar{V}_k = [\bar{v}_{k,ij}] = AVG[V : X is class k]

       R_k^{left}  = [r_{k,ij}^{left}],   (r_{k,ij}^{left})^2  = AVG[u_{ij}^2 : X is class k]
       R_k^{right} = [r_{k,ij}^{right}],  (r_{k,ij}^{right})^2 = AVG[v_{ij}^2 : X is class k]

From these, the required standard deviations can be computed as

    (\sigma_{k,i}^{s})^2      = \overline{s^2}_{k,i}   - (\bar{s}_{k,i})^2
    (\sigma_{k,ij}^{left})^2  = (r_{k,ij}^{left})^2    - (\bar{u}_{k,ij})^2
    (\sigma_{k,ij}^{right})^2 = (r_{k,ij}^{right})^2   - (\bar{v}_{k,ij})^2

The  efficient  sample  space  enhancement  techniques  applied  to  smooth 
out  structural  artifacts  in  the  KLT  case  can  also  be  applied  here. 
Specifically,  both  "flips"  and  "rotation"  can  be  applied.  Appendix  D 
discusses  how  to  implement  these  techniques  on  SVD's  of  pre-transformed 
data,  which  is  the  case  of  interest  here. 
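
The accumulation of class-specific first and second moments described
in this section can be sketched as follows. This is an illustrative
Python fragment, not the study's software: the function name and the
data layout (one flat list of quantities per training block, paired
with its class label) are assumptions, and the flip and rotation
enhancements are not included.

```python
from collections import defaultdict
from math import sqrt


def class_statistics(samples):
    """Accumulate per-class mean, RMS and standard deviation.

    `samples` is an iterable of (class_label, values) pairs, where
    `values` is a flat list of the quantities to be coded for one block
    (for example its singular values, or the entries of a singular
    vector coefficient array read in a fixed order).
    """
    count = defaultdict(int)
    sum1 = {}   # running sums of x    (first moment)
    sum2 = {}   # running sums of x^2  (second moment)
    for k, values in samples:
        if k not in sum1:
            sum1[k] = [0.0] * len(values)
            sum2[k] = [0.0] * len(values)
        count[k] += 1
        for i, x in enumerate(values):
            sum1[k][i] += x
            sum2[k][i] += x * x

    stats = {}
    for k, n in count.items():
        mean = [a / n for a in sum1[k]]
        rms = [sqrt(b / n) for b in sum2[k]]
        # sigma^2 = second moment - mean^2, clamped against round-off
        std = [sqrt(max(r * r - m * m, 0.0)) for r, m in zip(rms, mean)]
        stats[k] = {"mean": mean, "rms": rms, "std": std}
    return stats
```

Keeping the RMS values alongside the standard deviations is convenient
because the bit allocation of Section 5.4.1 uses both.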


In order to encode the coefficients resulting from the singular
vector orthogonal expansion enhancements discussed in 5.2.2, the first
two moments of the orthogonal expansion coefficients are required.
Appendix E computes expressions for these quantities which allow their
calculation from the statistics of {s_i, u_i, and v_i}.


5.4  Singular  Value/Vector  Coding 


To ensure adaptability to non-stationarity, a class-adaptive coding
scheme is employed. Figure 5-1 illustrated the SVD coding chain and
indicated the place of overhead, singular value, and singular vector
coding in the overall arrangement.

As in the KLT case, overhead consists of the global rate distortion
parameter D and each block's class label k. Both are required to permit
correct singular value/vector reconstruction at the decoder. (Note that
although the SVD transform is not class-dependent as was the KLT, the
class labels are nonetheless needed at the decoder for correct singular
value/vector reconstruction.) SVD overhead coding is identical to KLT
overhead coding.

The  overall  singular  value/singular  vector  coding  problem 
constitutes  a  hierarchy  of  coding  problems,  which  pose  the  following 
questions: 

•  How  are  bits  allocated  among  blocks, 

•  How  are  bits  allocated  to  terms  in  a  particular  block's  SVD 
expansion, 

•  How  are  bits  distributed  among  the  singular  value  and  two 
singular  vectors  in  particular  terms,  and 

•  What  coders  are  best  for  use  on  singular  values  and  singular 
vectors? 

5.4.1  Bit  Allocation 


To  obtain  solutions  to  these  problems  we  again  adopt  the  following 
global  problem  statement: 

MINIMIZE  :  Total  mean  squared  coding  error 

SUBJECT  TO  :  Not  exceeding  a  given  coding  rate 

and  specify  the  use  of  fixed  rate  coders  (coders  which  produce  codewords 
whose  lengths  do  not  depend  upon  the  input  values  to  be  coded). 


As demonstrated in Appendix C (in the context of KLT coding), the
optimal solution dictates allocating bandwidth to achieve a uniform
distribution of coding error over all blocks. This implies that busier
blocks are encoded with more bits than are quieter blocks. Furthermore,
Appendix C shows that each block's bit allocation problem can be
separately solved. Appendix F addresses this problem (the second and
third in the list), and shows that bandwidth should be allocated so that
coding error is distributed uniformly over all terms s_i u_i v_i^t in
the SVD expansion.

This means that those terms s_i u_i v_i^t which have the most
variation in energy will be allocated the most bits; those which are
more predictable receive fewer bits.

Appendix F also shows that the bit allocation problem can be solved
separately for each term, and that the optimal solution has the
following features:

•  the singular value s_i is allocated bits according to its
   variability, as given by its class k standard deviation,
   \sigma_i^s(k),

•  the singular vectors u_i, v_i are allocated bits according
   to the average size of the corresponding singular value s_i,
   as given by its class k RMS value, r_i^s(k), and

•  bits are distributed among the 1D transform coefficients of
   the singular vectors to achieve uniform coding error in each
   coefficient.

The mathematical form of the allocation rule is as follows:


where  B_{s_i}                = number of bits allocated to s_i,

       B_{u_{ij}}             = number of bits allocated to u_{ij},

       B_{v_{ij}}             = number of bits allocated to v_{ij},

       \sigma_i^s(k)          = standard deviation of s_i in class k,

       r_i^s(k)               = RMS value of s_i in class k,

       \sigma_{ij}^{left}(k)  = standard deviation of u_{ij} in class k,

       \sigma_{ij}^{right}(k) = standard deviation of v_{ij} in class k, and

       D                      = global distortion control parameter.

(This rule is an approximation based on a particular curve fit to the
performance characteristics of the fixed rate coders used to quantize
the singular values/vectors.)



As  an  example,  the  bit  assignment  matrices  shown  in  Figure  5-12 
were  calculated  from  these  rules.  The  figure  illustrates  the  bit 
assignments  for  the  singular  values  and  left  singular  vector  transform 
coefficients  (right  singular  vector  transform  coefficients  are  similar) 
for  two  classes. 


Both  inter-  and  intrablock  adaptivity  are  in  evidence.  The  former 
obtains  from  the  difference  in  total  number  of  bits  assigned,  the  latter 
from  the  selective  allocation  within  the  arrays.  Note  the  larger 
allocations  to  the  first  few  singular  values/vectors,  which  are  the  ones 
with  largest  energy.  Note  also  the  preferential  allocation  within 
columns  of  the  singular-vector-coefficient  bit-allocation  matrix, 
reflecting  the  energy  compaction  properties  of  the  transform. 
Additionally, note the evidence of the centroid-of-energy migration from
lower coefficient indices (top of column) to higher indices (bottom of
column), as reflected by the shifting bit allocation pattern as we move
from the first few singular vectors (left side of array) to the last few
singular vectors (right side of array). Finally, note that many
singular value/vector combinations are not coded at all. This is a
result of the highly efficient energy compaction into the first few
terms in the SVD expansion provided by the SVD transform.


5.4.2  Singular Value/Vector Coders


Figure 5-13 illustrates the singular value statistics specific to a
particular class of image blocks. Because each singular value extends
over a fairly narrow range (approximately constant in log space, except
for the last one), and because high fidelity singular value coding was
desired (for the same reason high fidelity "dc" coding is in the KLT
case), we selected a uniform quantizer for encoding singular values.

[Figure 5-13. Example Singular Value Statistics]


The procedure employed to encode singular values was:

•  Subtract mean: \tilde{s}_i = s_i - \bar{s}_i

•  Use a uniform quantizer of B_{s_i} bits, with lower end -\Delta
   and upper end \Delta, where

       \Delta = \sqrt{3} \sigma_i^s.

The value \Delta describes the extent of a uniform pdf with standard
deviation \sigma_i^s. Note: the s_1 case was handled slightly
differently; the quantizer extent was stretched to extend down to zero
instead of -\Delta, if necessary. This is the precise analog of the
special "dc" treatment included for the KLT, and is included to ensure
equally accurate quantization of average grey level for all blocks, and
to thereby minimize blockiness.
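
The singular value coding procedure above can be illustrated with a
short Python sketch. The function name, the midpoint reconstruction
rule, and the handling of zero bits or zero spread are assumptions made
for the example; the special treatment of the first singular value is
not included.

```python
from math import sqrt


def uniform_quantize(s, mean, sigma, bits):
    """Quantize a singular value with a uniform quantizer on [-delta, delta].

    The class mean is removed first; delta = sqrt(3) * sigma is the
    half-width of a uniform pdf whose standard deviation is sigma.
    Returns (index, reconstructed_value).
    """
    if bits <= 0 or sigma <= 0.0:
        return 0, mean                       # nothing coded: fall back to the mean
    delta = sqrt(3.0) * sigma
    levels = 2 ** bits
    step = 2.0 * delta / levels
    x = min(max(s - mean, -delta), delta)    # clamp to the quantizer range
    index = min(int((x + delta) / step), levels - 1)
    recon = -delta + (index + 0.5) * step    # midpoint of the selected cell
    return index, recon + mean
```

For inputs inside the quantizer range the reconstruction error is at
most half a step, that is, delta / 2**bits.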

As in the KLT case for transform coefficient coding, a Max
quantizer was applied to encode singular vector transform coefficients
in the SVD case. A tail-modified Gaussian pdf was assumed, and Max
quantizers of 1, 2, ..., 8-bit length were used. The coding procedure
was:

•  Subtract mean from coefficients,

•  Normalize by standard deviations of coefficients,

•  Encode coefficients using BCD representation of the
   quantization levels obtained from B_{u_{ij}}- or B_{v_{ij}}-bit (as
   appropriate) Max quantizers.
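
For readers unfamiliar with Max quantizers, the following Python sketch
designs one by Lloyd iteration and applies it to a normalized
coefficient. It assumes a plain zero-mean, unit-variance Gaussian pdf
rather than the tail-modified Gaussian used in the study, and the
function names and iteration count are illustrative choices.

```python
from math import erf, exp, pi, sqrt


def _pdf(x):
    return exp(-0.5 * x * x) / sqrt(2.0 * pi)


def _cdf(x):
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))


def max_quantizer(bits, iterations=500):
    """Design a Lloyd-Max quantizer for a unit Gaussian.

    Returns (thresholds, levels) with 2**bits levels and
    2**bits - 1 decision thresholds.
    """
    n = 2 ** bits
    levels = [-3.0 + 6.0 * (i + 0.5) / n for i in range(n)]  # even start on [-3, 3]
    big = 40.0   # stands in for infinity; the Gaussian mass beyond it is negligible
    for _ in range(iterations):
        # Nearest-neighbour condition: thresholds midway between levels.
        thresholds = [0.5 * (a + b) for a, b in zip(levels, levels[1:])]
        edges = [-big] + thresholds + [big]
        # Centroid condition: each level is the conditional mean of its cell.
        new_levels = []
        for lo, hi, old in zip(edges, edges[1:], levels):
            mass = _cdf(hi) - _cdf(lo)
            new_levels.append((_pdf(lo) - _pdf(hi)) / mass if mass > 1e-300 else old)
        levels = new_levels
    thresholds = [0.5 * (a + b) for a, b in zip(levels, levels[1:])]
    return thresholds, levels


def quantize(x, thresholds, levels):
    """Map a normalized coefficient to (codeword index, reconstruction level)."""
    index = sum(1 for t in thresholds if x > t)
    return index, levels[index]
```

A coefficient c with class mean m and standard deviation sd would be
coded as quantize((c - m) / sd, thresholds, levels), and decoded as
m + sd * level.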

5.4.3  Coding  the  SVD  Orthogonal  Expansion  Coefficients 

In this enhancement singular vectors are treated differently. The
procedure is cyclic, and is repeated for each of the left and right
singular vectors. For concreteness, suppose u_j (the vector of
transform coefficients for the j-th left singular vector) is to be
coded. A single cycle consists of the following steps:

•  Find the basis matrix B_j,

•  Find the coefficients \alpha_j of u_j with respect to B_j,

•  Quantize \alpha_j and output the resulting codeword to the
   buffer, and

•  Reconstruct values for next cycle.

A similar set of steps produces \beta_j's from the v_j's.

The theory to support this process was covered in Section 5.2.2.
The topic here encompasses only the quantizers and bit assignments used
for encoding the \alpha_{ij}'s. For the quantizer, the same
modified-Gaussian Max quantizer employed for the KLT and the other SVD
algorithms is employed here. Coding is performed by the following
procedure:

•  Determine B_{s_i} as in § 5.4.1

•  Determine B_{u_{ij}} and B_{v_{ij}} as in § 5.4.1, using statistics of
   \alpha_{ij} in place of those of u_{ij} (and those of \beta_{ij} in
   place of those of v_{ij})

•  Encode \alpha_{ij} using a B_{u_{ij}}-bit quantizer, and \beta_{ij}
   using a B_{v_{ij}}-bit quantizer.

At the conclusion of this cycle, the next iteration is begun. This
consists of incrementing j to j+1 and repeating the above process for
u_{j+1} and v_{j+1}.


6.0  RATE EQUALIZATION


Rate equalization is the process of meeting a global target
compressed rate (in bits per pixel). Since the coding algorithms
employed in this study are class-adaptive and since it is generally not
known ahead of time how many blocks of each class are present in a given
image, it is necessary to regulate the coding process in order to adapt
to changing class population from image to image [6].

The concept upon which rate equalization is based is illustrated in
Figure 6-1, which shows the overall coder performance curve. The curve
shows how the coder's output rate (average bits per pixel) depends upon
a control parameter D. The parameter is a measure of the distortion
added during coding: in order to achieve a small coded rate a large D
is necessary; for larger coded rates, a smaller D will do. The task of
rate equalization is to select the D that meets the target rate. Since
the curve depends upon not only the coder, but also the image being
encoded, the problem is non-trivial.

The method of rate equalization employed in the study is predictive
rate equalization. This means that it is performed prior to coding the
image. That is, no trial-and-error coding is required to meet the
target rate. The correct value of D can be determined before any coding
commences.

The determination of the correct global distortion parameter D is
based on two types of information:

•  class-specific transform coefficient statistics, and

•  class populations.

Thus, D depends only upon aggregate image information; it does not
depend upon the actual image data (pixel values) themselves.

[Figure 6-1. Rate Equalization]

The rate equalization problem is solved by determining the value of
D which satisfies the following condition:

    Total bits for image = \sum_{class k} N_k B_k(D)               (6.1)

where  N_k    = number of blocks of class k

       B_k(D) = number of bits allocated to class k blocks

The quantity B_k(D) represents the total class k bit allocation
and is obtained by summing over all elements of the class-k bit
allocation array. For the KLT case, the expression for B_k(D) is:


where \sigma_{ij}(k) is the standard deviation of the ij-th element of the
KLT coefficient array Z obtained from class k blocks. For the SVD case,
the expression is more complicated, since singular values, left singular
vectors and right singular vectors must all be coded:

    B_k(D) = \sum_{j=1}^{n} \log_2 [ \sigma_j^s(k) / D ]

           + \sum_{i=1}^{n} \sum_{j=1}^{n} \log_2 [ \sigma_{ij}^{left}(k) r_j^s(k) / (\gamma D) ]

           + \sum_{i=1}^{n} \sum_{j=1}^{n} \log_2 [ \sigma_{ij}^{right}(k) r_j^s(k) / (\gamma D) ]     (6.3)

where

    \sigma_j^s(k)          = standard deviation of j-th singular value for
                             class k blocks,

    r_j^s(k)               = RMS value of j-th singular value for class k
                             blocks,

    \sigma_{ij}^{left}(k)  = standard deviation of i-th transform
                             coefficient of j-th left singular vector for
                             class k blocks,

    \sigma_{ij}^{right}(k) = standard deviation of i-th transform
                             coefficient of j-th right singular vector for
                             class k blocks, and

    \gamma                 = a constant of proportionality.

The rate equalization process thus entails finding D to satisfy
these conditions. For this study, the process was implemented by
performing an iterative search of log D-space, relying upon the convexity
of the R/D curve of Figure 6-1 to ensure rapid convergence.
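
A predictive search of this kind can be sketched in Python as follows.
The sketch is illustrative only. It uses plain bisection in log D rather
than the faster correction scheme used in the study, and it substitutes
an assumed logarithmic allocation, max(0, log2(sigma / D)) per coded
quantity, for the study's own bit allocation expressions. All function
and parameter names are hypothetical.

```python
from math import log2


def class_bits(sigmas, d):
    """Predicted bits for one block of a class.

    Assumed stand-in for B_k(D): sum of max(0, log2(sigma / D)).
    """
    return sum(max(0.0, log2(s / d)) for s in sigmas if s > 0.0)


def total_bits(populations, class_sigmas, d):
    """Right-hand side of (6.1): sum over classes of N_k * B_k(D)."""
    return sum(n_k * class_bits(class_sigmas[k], d)
               for k, n_k in populations.items())


def equalize_rate(populations, class_sigmas, target_bits, tol=1e-3, max_iter=100):
    """Bisection in log2(D) for the D whose predicted bit count meets the target."""
    s_max = max(max(s) for s in class_sigmas.values())
    hi = log2(s_max)            # D = s_max yields zero bits
    lo = hi - 1.0
    # Lower D (more bits) until the target is bracketed.
    while total_bits(populations, class_sigmas, 2.0 ** lo) < target_bits:
        lo -= 1.0
    mid = lo
    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        bits = total_bits(populations, class_sigmas, 2.0 ** mid)
        if abs(bits - target_bits) <= tol * target_bits:
            break
        if bits > target_bits:
            lo = mid            # too many bits: raise D
        else:
            hi = mid            # too few bits: lower D
    return 2.0 ** mid
```

Only the class populations and the class statistics enter the search;
no pixel data is touched, which is the sense in which the method is
predictive.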


It is important to note that for each trial value of D, only a
simple analytical expression (equation 6.2 or 6.3) need be computed and
the result compared with the goal to see if (6.1) is satisfied. If not,
a correction to D is applied and the procedure is repeated. Actual
image coding is not necessary to find the correct D. In addition,
experience indicates that convergence usually occurs in two iterations,
but essentially always by the third iteration.


7.0  ALGORITHM EVALUATION


Algorithm  evaluation  was  conducted  in  two  phases: 

•  Preliminary, and

•  Comprehensive. 

The preliminary evaluation was aimed at investigating the relative
merits of the various perturbations of the KLT and SVD algorithms
developed under the effort, using a small set of test imagery. Based on
this evaluation, the best members in each category were selected and
more thoroughly exercised against a larger set of imagery to compare
their performance with each other and with the baseline cosine and
Hadamard algorithms.

All algorithms were essentially identical in all ways except for
which transform was applied. Thus, all had the following features:

•  Class-adaptive coefficient coding,

•  Empirical accumulation of class-specific coefficient
   statistics (except KLT/P),

•  Special, error-free, "dc" coding,

•  Same intensity mappings,

•  Same block labels obtained from block classification,

•  16 x 16 block size,

•  Same block boundaries,

•  Same fixed rate quantizers (uniform and Max), and

•  Same rate equalization techniques.

Algorithms were compared on several bases:

•  Coding efficiency:

   -  Mean square error versus coded rate
   -  Mean absolute error versus coded rate
   -  Subjective perception of distortion in reconstructed
      image versus coded rate
   -  Subjective perception of information in error image
      versus coded rate

•  Computation efficiency:

   -  Execution time
   -  Adaptability to hardware implementation

The  remainder  of  this  section  is  divided  into  two  subsections,  7.1  which 
summarizes  the  findings  of  the  preliminary  evaluation,  and  7.2  which 
presents  the  results  of  the  comprehensive  evaluation. 

7.1  Preliminary  Evaluation 

Preliminary  evaluation  consisted  of  two  parts: 

•  Determination  of  optimal  class  boundaries,  and 

•  Algorithm  evaluation. 

7.1.1  Optimal Class Boundaries

All algorithms tested employed class-adaptive coefficient coding.
An important aspect of such algorithms is determining good class
definitions. In this study, a single scalar feature was extracted from
each block and used to classify the block into one of eight classes,
according to the value of that feature. Specifically, the feature used
was block a.c. energy:

    u(X) = (1/n^2) \sum_{i,j} (x_{ij} - \bar{x})^2,

where \bar{x} is the mean of the n x n block X, and the classifier took
the form:

    X is labeled class k IF t_{k-1} < u(X) \le t_k.

This portion of preliminary evaluation dealt with determining the best
values for {t_k}.

Five types of threshold settings were investigated:

(1)  Uniform class population in test image,

(2)  Uniform class population over many images,

(3)  Uniform thresholds in u,

(4)  Uniform thresholds in \sqrt{u},

(5)  Uniform thresholds in log(u).

Evaluation  consisted  of  exercising  the  baseline  cosine  coding  algorithm 
on  a  particular  GFE  aerial  image,  for  each  of  the  threshold  settings, 
over  a  range  of  compression  rates. 

Comparisons were based on mean square coding error (MSE), on mean
absolute coding error (MAE), and on subjective comparisons of original,
coded, and error images. The conclusion was that, based on MSE, (4)
performed best with (1) a close second. Based on MAE, (4) again
performed best, but this time both (1) and (2) were close.
Subjectively, (4) was judged to produce the best results with (5) a
close second.

Altogether, (4) was selected as best. Therefore, class boundaries
uniform in \sqrt{u} (uniform in block RMS value) were used for all
algorithms during the remainder of the evaluations.
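
A block classifier with thresholds uniform in the block RMS value can
be sketched as follows. This Python fragment is illustrative: the
feature is taken to be the mean squared deviation from the block mean,
and the upper limit u_max of the threshold range is a hypothetical
parameter, since the actual threshold values are not given here.

```python
from bisect import bisect_left
from math import sqrt


def ac_energy(block):
    """Mean squared deviation of the pixels from the block mean."""
    pixels = [p for row in block for p in row]
    mean = sum(pixels) / len(pixels)
    return sum((p - mean) ** 2 for p in pixels) / len(pixels)


def sqrt_uniform_thresholds(u_max, n_classes=8):
    """Upper thresholds t_1..t_K, evenly spaced in sqrt(u) up to sqrt(u_max)."""
    step = sqrt(u_max) / n_classes
    return [(k * step) ** 2 for k in range(1, n_classes + 1)]


def classify(block, thresholds):
    """Return the 1-based class k with t_{k-1} < u(X) <= t_k.

    Flat blocks (u = 0) fall in class 1; values above the last
    threshold are clamped to the highest class.
    """
    u = ac_energy(block)
    k = bisect_left(thresholds, u) + 1
    return min(k, len(thresholds))
```

Spacing the thresholds evenly in sqrt(u) makes the class intervals in u
widen quadratically, so quiet blocks are resolved more finely than busy
ones.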

7.1.2  Preliminary  Algorithm  Evaluation 

The following algorithms were compared under preliminary evaluation:

•  COS           :  2D cosine transform (baseline)

•  HAD           :  2D Hadamard transform (baseline)

•  KLT/P         :  Class-adaptive KLT using predicted
                    coefficient statistics

•  KLT/E         :  Class-adaptive KLT using empirical
                    coefficient statistics

•  SVD/COS       :  SVD using 1D cosine transform of singular
                    vectors

•  SVD/HAD       :  SVD using 1D Hadamard transform of singular
                    vectors

•  SVD/COS/RO    :  Same as SVD/COS but with reordering
                    enhancement

•  SVD/HAD/RO    :  Same as SVD/HAD but with reordering
                    enhancement

•  SVD/COS/ORTH  :  Same as SVD/COS but with orthogonal expansion
                    enhancement

•  SVD/COS/RP    :  Same as SVD/COS but with repolarization
                    enhancement

Each algorithm was applied to a GFE aerial image of an airfield at three
coded rates: 0.5, 1.0, and 1.5 bits per pixel (bpp).


Mean  Squared  Error 

The first eight algorithms were evaluated first. The last two were
later enhancements and were evaluated subsequently. Figure 7-1 shows the
mean square coding error plots for the first eight algorithms. Each curve
is a piecewise linear fit to the three evaluation points (0.5, 1.0, and
1.5 bpp). Curves located towards the bottom of the plot indicate better
coding efficiency than do curves located towards the top.

From this figure, the KLT/E and COS algorithms are seen to perform
best; they add the least amount of mean square coding error of any
algorithm. Since their curves essentially overlap, it is not possible
to judge relative superiority of one of these over the other on the
basis of MSE; however, both are markedly superior to the remaining six
algorithms.

The figure also shows that the poorest performance is exhibited by
algorithms employing the Hadamard transform. It is illuminating to
compare the COS and HAD curves to see how much coding efficiency one
gives up to gain the computational efficiency provided by the HAD
algorithm. For example, the figure indicates that the COS performs as
well at 0.5 bpp as the HAD does at twice that rate, 1.0 bpp. This same
effect is in evidence in comparing the various SVD/COS algorithms with
the various SVD/HAD algorithms.

Of particular relevance to this study, the figure shows that the
various SVD algorithms perform significantly worse than the COS or KLT/E
algorithms. The SVD/COS algorithms are superior to the baseline HAD
algorithm but are nonetheless inferior to the baseline COS and the KLT/E.

Additionally, the figure indicates that using empirically
determined statistics in the KLT is superior to using predicted
statistics based on a separable covariance model. This indicates that
this model is not particularly good at characterizing image correlation,
even though use of the KLT transform operators derived under this model
does yield good results when empirical statistics are provided to the
coder.

[Figure 7-1. Coding Algorithm Comparisons, MSE versus Rate, Airfield
Image]

Finally, the singular value/vector reordering enhancement is seen
to slightly degrade, rather than improve, both SVD/COS and SVD/HAD
performance. This indicates that ordering on the basis of singular
value size produces smaller dispersions in singular vector coefficient
statistics than does ordering on the basis of singular vector frequency
(or sequency) content.

Mean Absolute Error

Very similar relative algorithm performance is indicated by the
mean absolute error curves of Figure 7-2. These curves show the
intensity of the error image obtained at each experiment point. Since,
on the whole, the curves occupy the same relative positions in Figure
7-2 as they do in Figure 7-1, similar conclusions on relative
performance are drawn.
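
For reference, the two objective measures can be computed as in the
following Python sketch, which assumes the original and reconstructed
images are supplied as equally sized lists of pixel rows. The function
name is illustrative.

```python
def coding_errors(original, reconstructed):
    """Return (MSE, MAE) between two equally sized images (lists of rows)."""
    diffs = [a - b
             for row_a, row_b in zip(original, reconstructed)
             for a, b in zip(row_a, row_b)]
    n = len(diffs)
    mse = sum(d * d for d in diffs) / n      # mean square coding error
    mae = sum(abs(d) for d in diffs) / n     # mean absolute coding error
    return mse, mae
```

Because MSE squares each pixel difference, it weights a few large
errors more heavily than MAE does, which is one reason both measures
are reported.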

Subjective  Evaluation 

Figure  7-3  illustrates  a  GFE  aerial  photograph  made  available  for 
algorithm  testing.  The  256  X  256  subset  shown  was  extracted  and  used 
for  preliminary  evaluation.  The  results  of  Figures  7-1  and  7-2  were 
obtained  by  processing  this  subset.  Additional,  subjective  comparisons 
of  the  algorithms  were  also  performed. 

Figures 7-4 through 7-8 show the reconstructed images
obtained by applying the various algorithms at several bit rates.
Figures 7-9 and 7-10 show error images for the eight algorithms operated
at 1.0 bpp. Several observations were obtained by examining these
pictures.

[Figure 7-2. Coding Algorithm Comparison, MAE versus Rate, Airfield
Image]

[Figure 7-3]

[Figure 7-4]

[Figure 7-6]

[Figure 7-7]

[Figure 7-8]

[Figure 7-9. Panel titles: KLT Predictive Error, 1.0 bpp; KLT Empirical
Error, 1.0 bpp]

First,  the  COS  and  KLT/E  algorithms  are  subjectively  best.  They 
are indistinguishable in terms of subjective performance and yield the
best  reconstructed  images,  both  in  terms  of  edge  crispness  and 
continuity  and  faithful  texture  rendition.  The  COS  and  KLT/E  produce 
the  lowest  brightness  error  images  with  the  least  structure  in  them. 
They  also  produce  the  smallest  number  of  very  bright  error  pixels. 
Reconstructed  images  retain  all  important  detail  at  all  three  bit 
rates:  1.5,  1.0,  and  0.5  bpp. 

Second,  the  SVD/COS  algorithm  performs  subjectively  well.  It  is 
subjectively  indistinguishable  from  the  SVD/COS/RO  algorithm.  It 
renders  detail  well  at  1.5  and  1.0  bpp,  although  it  is  inferior  to  both 
COS  and  KLT/E  at  these  rates.  This  inferiority  is  evidenced  in  several 
categories,  including: 

•  Crispness  of  edges  in  reconstructed  images, 

•  Rendition  of  texture  in  reconstructed  images,  and 

•  Intensity of error images.

On  the  other  hand,  the  SVD/COS  is  approximately  equivalent  to  COS  and 
KLT/E  in  terms  of  the  structure  in  the  error  images  and  the  number  of 
very  bright  error  image  pixels. 

Third, the KLT/P is, subjectively, considerably inferior to all
three of the COS, KLT/E, and the SVD/COS (and SVD/COS/RO) algorithms.
This inferiority is consistent across all bit rates and is reflected in
all the subjective measures just discussed.

Last, the HAD, SVD/HAD and SVD/HAD/RO are worst in all categories.
Especially noticeable is the error image structure, which appears to
accurately capture the essential structural information in the original
image. Since good performance dictates having uncorrelated error and
reconstructed imagery, this is an indication of poor coding performance.
Orthogonal  Expansion  and  Repolarization  Enhancements

Due to the apparent poor showing of the SVD-based algorithms with
respect to the COS baseline, the orthogonal expansion and repolarization
enhancements were developed as a final attempt at optimizing the SVD
coder. These enhancements represented an additional effort to exploit
the last possible sources of redundancy in the SVD decomposition in
order to extract as high a degree of performance as possible.

Both enhancements were evaluated, and neither improved the SVD/COS
performance markedly. The SVD/COS/ORTH was slightly better in terms of
MSE and MAE, but the difference was similar to the small difference
between the KLT/E and COS algorithms, and no subjective difference was
apparent. The repolarization enhancement produced similar results, but
in its case the objective performance was slightly poorer, while the
subjective performance was indistinguishable.

Conclusions  of  Preliminary  Algorithm  Evaluation 

The  following  points  summarize  the  preliminary  evaluation  results: 

•  COS  is  superior  to  SVD/COS, 

•  KLT/E  and  COS  are  tied, 

•  SVD/COS  is  superior  to  SVD/HAD,

•  KLT/E  is  superior  to  KLT/P, 

•  HAD  performed  worst, 

•  Reordering  does  not  improve  SVD  coding, 

•  Orthogonal expansion does not markedly improve SVD coding, and

•  Repolarization does not improve SVD coding.

Based on these findings, the best algorithm in each category can be
identified:

•  Baseline : COS

•  KLT      : KLT/E

•  SVD      : SVD/COS

7.2  Comprehensive  Evaluation 

This subsection reports on the results of the comprehensive
algorithm evaluation. Four algorithms were applied:

•  COS : cosine baseline,

•  HAD : Hadamard baseline,

•  SVD : SVD/COS algorithm, and

•  KLT : KLT/E algorithm.

These algorithms were applied to four test images, each one a 256 x 256
subset of a larger GFE image. These images were:

•  Visible airfield,

•  Visible harbor scene,

•  Infrared airfield, and

•  SAR airfield.

Each algorithm was applied to each image at three different bit rates:

•  Visible  airfield  :  0.5,  1.0,  1.5  bpp 

•  Visible  harbor  :  0.5,  1.0,  1.5  bpp, 

•  Infrared  airfield  :  0.25,  0.5,  1.0  bpp,  and 

•  SAR  airfield  :  0.5,  1.0,  1.5  bpp. 

The  total  number  of  image  encodings/decodings  was  thus  48. 

Figures 7-11 through 7-26 show the original and coded images
involved in the evaluation. Figures 7-11 through 7-14 pertain to the
visible airfield image, Figures 7-15 through 7-18 to the harbor scene,
Figures 7-19 through 7-22 to the IR image, and Figures 7-23 through 7-26
to the SAR image.

Figures  7-27  through  7-34  give  a  summary  of  objective  coding 
performance  measures  in  terms  of  MSE  and  MAE  versus  coding  rate. 
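
For orientation, the two objective measures can be sketched as follows. This is a minimal illustration and not the evaluation software used in the study. The function names are hypothetical, and the normalization of percent MSE by the mean squared pixel value of the original is an assumption, since the definition behind the "%MSE" figures is not restated in this section.

```python
def mse(original, coded):
    """Mean square error between two equal-size images (lists of rows)."""
    diffs = [(a - b) ** 2 for ro, rc in zip(original, coded) for a, b in zip(ro, rc)]
    return sum(diffs) / len(diffs)


def mae(original, coded):
    """Mean absolute error between two equal-size images."""
    diffs = [abs(a - b) for ro, rc in zip(original, coded) for a, b in zip(ro, rc)]
    return sum(diffs) / len(diffs)


def percent_mse(original, coded):
    """MSE as a percentage of the mean squared pixel value (assumed normalization)."""
    energy = [a * a for row in original for a in row]
    return 100.0 * mse(original, coded) / (sum(energy) / len(energy))


# Tiny hypothetical 2 x 2 "image" and its coded version.
original = [[10, 20], [30, 40]]
coded = [[12, 18], [30, 43]]
print(mse(original, coded), mae(original, coded), percent_mse(original, coded))
```

Plotting either measure against the coding rate in bits per pixel gives curves of the kind summarized in the figures.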

These results can be summarized as follows:

•  MSE, MAE, and subjective evaluation yield the same conclusions,

•  COS and KLT perform equally well and best,

•  Next is SVD, and

•  Poorest is HAD.

In terms of computational load, the following rank ordering applied:

•  HAD is lowest,

•  COS is next,

•  KLT is third, and

•  SVD is significantly highest.

In terms of extendability to special purpose hardware, both the COS
and HAD are good candidates owing to their "fast" algorithms. In fact,
special purpose hardware for 1D versions of these transforms already
exists. The KLT could be implemented in special purpose hardware, but
it would be significantly more cumbersome due to the requirement to
perform full matrix multiplications. A special purpose hardware
implementation of the SVD is not so practical, owing to its reliance
upon an iterative procedure which is not guaranteed to converge in a
finite number of steps.
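
To make the remark about "fast" algorithms concrete, the sketch below compares a direct Hadamard matrix multiplication (on the order of n^2 additions) with the butterfly form (on the order of n log2 n additions). It is an illustration only: the unnormalized, natural-ordered Hadamard matrix and the function names are assumptions, and the length n must be a power of two.

```python
def hadamard_direct(x):
    """Multiply by the n x n Hadamard matrix entry by entry (about n*n additions)."""
    n = len(x)
    # Entry (i, j) of the natural-ordered Hadamard matrix is (-1)^popcount(i & j).
    return [sum(x[j] * (-1) ** bin(i & j).count("1") for j in range(n))
            for i in range(n)]


def hadamard_fast(x):
    """Fast Walsh-Hadamard transform using butterflies (about n*log2(n) additions)."""
    out = list(x)
    n = len(out)
    h = 1
    while h < n:
        for start in range(0, n, 2 * h):
            for k in range(start, start + h):
                a, b = out[k], out[k + h]
                out[k], out[k + h] = a + b, a - b
        h *= 2
    return out


samples = [1, 0, 2, 5, 3, 1, 4, 2]
print(hadamard_fast(samples) == hadamard_direct(samples))
```

A KLT, by contrast, has no such structure in general, so each 1D transform costs a full matrix-vector product.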

In  summary,  taken  together,  these  observations  point  to  the 
conclusion  that  the  cosine  transform  coder  is  the  best  algorithm  amongst 
those  tested.  In  certain  applications  where  computational  efficiency  is 
paramount,  the  Hadamard  algorithm  may  be  warranted.  However,  neither 
coding  nor  computational  efficiency  seems  to  favor  the  KLT  or  SVD  in  any 
case. 
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Figure  7.11 


KLT 1.0 bpp 0.53% MSE    SVD 1.0 bpp 0.93% MSE

Figure 7.13


KLT 0.5 bpp 1.30% MSE    SVD 0.5 bpp 1.94% MSE

Figure 7.14


Figure 7.15


SVD 0.5 bpp 3.16% MSE


Figure 7.19


SVD 0.25 bpp 2.34% MSE


Figure 7.23


KLT 1.0 bpp 6.42% MSE    SVD 1.0 bpp 9.73% MSE


KLT 0.5 bpp 13.0% MSE    SVD 0.5 bpp 17.7% MSE

Figure 7.25


KLT 0.25 bpp 21.3% MSE    SVD 0.25 bpp 29.2% MSE

Figure 7.26
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Figure  7-27.  Coding  Algorithm  Comparisons; 
MSE  versus  Rate,  Airfield  Image 
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Figure  7-31.  Coding  Algorithm  Comparisons 
MAE  versus  Rate,  Airfield  Image 
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8.0  CONCLUSIONS 


The principal conclusion of this study is that among the algorithms
tested, the cosine transform algorithm appears to be the best performer
in terms of coding efficiency. Computational efficiency points to the
Hadamard algorithm as a better choice, but it suffers a significant
performance degradation compared to cosine as a price. The SVD based
algorithms displayed coding efficiency intermediate to the cosine and
Hadamard, but computational efficiency worse than both. Because special
purpose hardware can be used to implement it efficiently, the cosine
approach appears best for applications requiring the highest degree of
compression with the smallest coding distortion.

More generally, results point to the success of compressing various
eight-bit images down to at least 1.0 bit per pixel using the better
transform techniques. In several cases, good performance down to 0.5
bits per pixel was also observed. All transform coders performed well
at 1.5 bits per pixel.

Significantly, results demonstrated that although the singular
value decomposition produces extremely high energy compaction into a
small number of singular values by virtue of its being tailored to the
image data, the price of also having to code singular vectors renders
the approach less efficient overall than either the tailored-to-class
KLT or the fixed cosine approaches. That this observation was constant
over an assortment of techniques developed to minimize the bandwidth
required for singular vector coding suggests that this conclusion is
robust and that the SVD is inherently inferior for image coding
applications. In addition, since the Karhunen-Loeve approach yielded
performance results comparable to the cosine transform, it can be
concluded that it is the tailoring of the coefficient coding process,
and not the tailoring of the transform, which is most important in image
coding.
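
A rough count shows why the singular vectors are costly. The sketch below is a back-of-envelope illustration, not an analysis from the study: the block size of 16 is hypothetical, and the count ignores the savings that unit-norm and orthogonality constraints on the vectors could provide.

```python
def svd_values_to_code(n, k):
    """Numbers sent for k SVD terms of an n x n block.

    Each term needs one singular value, one left vector (n entries)
    and one right vector (n entries).
    """
    return k * (2 * n + 1)


n = 16  # hypothetical block size, chosen only for illustration
for k in (1, 2, 4, 8):
    sent = svd_values_to_code(n, k)
    # A fixed transform (cosine, Hadamard) never sends its basis, so its
    # worst case is the n*n coefficients of the block itself.
    print(k, sent, sent / (n * n))
```

Under these assumptions, eight SVD terms already require more numbers (264) than the block has pixels (256), whereas a fixed or class-tailored basis is known to the decoder in advance.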
All of the algorithms tested achieved tailoring of the coefficient
coding process via class adaptivity, ensuring the allocation of
bandwidth to portions of an image where most required. In addition,
these algorithms distribute bandwidth within blocks to the most
important information that that block contains.

Such adaptivity ensures robustness and the capability to deal with
highly non-stationary imagery. The price is that rate equalization is
required to achieve target global compression rates. In this study, an
approach was employed that guaranteed meeting the specified target rate
through a process of predictive rate equalization, which was based on
block class populations and class-specific statistics and which avoided
trial and error coding.
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APPENDIX  A 


DERIVATION  OF  THE  SEPARABLE  KARHUNEN-LOEVE  TRANSFORMATION 
AND  ASSOCIATED  STATISTICS 

In  this  appendix  the  separable  KLT  is  derived  and  the  predicted 
statistics  of  the  resulting  KLT  coefficients  are  determined  from  the 
block  mean  and  row  and  column  covariance  matrices. 

A.1 Model


A block X is assumed to possess a separable covariance function.
Such a situation can be modeled by assuming that block X is generated by
an outer product matrix multiplication on a zero-mean, stationary white
matrix:

$$X = H W G^t + \bar{X} \qquad (A.1)$$

where

•  H and G are normalized so that

$$\mathrm{tr}\, H^t H = n, \qquad \mathrm{tr}\, G^t G = n \qquad (A.2)$$

•  W is an n x n matrix of variance-$\sigma^2$, uncorrelated, zero-mean
   random variables, $W = [w_{ij}]$, i.e.

$$E\, w_{ij} = 0, \qquad E\, w_{ij}^2 = \sigma^2, \qquad E\, w_{ij} w_{pq} = 0 \ \text{for}\ (p,q) \ne (i,j), \qquad (A.3)$$

   and

•  The n x n matrix $\bar{X}$ is the mean of the block X.

Figure A-1 illustrates the assumed model. That such a model
results in a separable covariance can be verified by determining an
expression for the covariance $c(i,j;p,q)$ of pixels $x_{ij}$ and $x_{pq}$ in X
(here $e_i$ denotes the unit vector with ith element 1 and the rest 0):

$$
\begin{aligned}
c(i,j;p,q) &= E\,[e_i^t (X - \bar{X})\, e_j]\,[e_q^t (X - \bar{X})^t e_p] \\
&= E\,[e_i^t H W G^t e_j]\,[e_q^t G W^t H^t e_p] \\
&= e_i^t H\, E\{W G^t e_j e_q^t G W^t\}\, H^t e_p \\
&= e_i^t H\, [\mathrm{Trace}(G^t e_j e_q^t G)\, \sigma^2 I]\, H^t e_p \\
&= e_i^t H H^t e_p \cdot \sigma^2 \cdot \mathrm{Trace}(G^t e_j e_q^t G) \\
&= \sigma^2\, (e_i^t H H^t e_p)\,(e_q^t G G^t e_j) \\
&= C_V(i,p) \cdot C_H(j,q) \qquad (A.4)
\end{aligned}
$$

where $C_V(i,p) = \sigma \cdot e_i^t H H^t e_p$, and
$C_H(j,q) = \sigma \cdot e_j^t G G^t e_q$.


Since c(i,j; p,q) can be written as the product of two functions each
separately dependent upon vertical and horizontal pixel displacement,
equation (A.1) is seen to model a block with a separable covariance
function.
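
The factorization in (A.4) can be checked numerically on a small case. The sketch below is illustrative only: the 2 x 2 matrices H and G and the value of sigma^2 are arbitrary choices (each satisfies the trace normalization of (A.2)), and the covariance is evaluated exactly by summing over the white entries of W rather than by simulation.

```python
from itertools import product

n = 2
SIGMA2 = 2.0                      # variance of each white entry w_kl (arbitrary)
H = [[1.0, 0.0], [0.6, 0.8]]      # tr(H^t H) = 2 = n
G = [[0.8, 0.6], [0.0, 1.0]]      # tr(G^t G) = 2 = n


def gram(m):
    """Return m m^t."""
    return [[sum(m[i][k] * m[p][k] for k in range(n)) for p in range(n)]
            for i in range(n)]


def cov_direct(i, j, p, q):
    """Cov(x_ij, x_pq) for X - Xbar = H W G^t, summed over white entries.

    x_ij - xbar_ij = sum_kl H[i][k] w_kl G[j][l]; since the w_kl are
    uncorrelated, only products of the same w_kl contribute sigma^2.
    """
    return SIGMA2 * sum(H[i][k] * G[j][l] * H[p][k] * G[q][l]
                        for k, l in product(range(n), repeat=2))


HHt, GGt = gram(H), gram(G)
max_gap = max(abs(cov_direct(i, j, p, q) - SIGMA2 * HHt[i][p] * GGt[j][q])
              for i, j, p, q in product(range(n), repeat=4))
print(max_gap)
```

The printed gap is at rounding level, so every pixel-pair covariance equals the product of a vertical factor and a horizontal factor.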
Figure  A-1.  Separable  Covariance  Block  Model


A.2 KLT Derivation


The objective is to find a separable, unitary transformation from X
into another array Z:

$$Z = U^t X V, \qquad (A.5)$$

such that the $z_{ij}$ in Z are uncorrelated and have maximum energy
compaction into the upper left-hand corner of Z. Note that because U
and V are unitary:

$$X = U Z V^t, \qquad (A.6)$$

i.e., X is recoverable from Z via the inverse unitary operation.

Another way of writing (A.6) is:

$$X = \sum_{i,j=1}^{n} z_{ij}\, u_i v_j^t \qquad (A.7)$$

where $u_i$ is the ith column of U and $v_j$ is the jth column of V. X is
thus a weighted sum of basis blocks $u_i v_j^t$.

According to Shannon, optimum coding dictates selecting U and V
such that the $z_{ij}$ are uncorrelated. The coefficient $z_{ij}$ can be
expressed as:

$$z_{ij} = u_i^t X v_j \qquad (A.8)$$

Therefore

$$E\, z_{ij} = u_i^t \bar{X} v_j \triangleq \bar{z}_{ij} \qquad (A.9)$$

and

$$
\begin{aligned}
E\,(z_{ij} - \bar{z}_{ij})(z_{kl} - \bar{z}_{kl})
&= E\,[u_i^t (X - \bar{X})\, v_j]\,[v_l^t (X - \bar{X})^t u_k] \\
&= E\,[u_i^t (H W G^t)\, v_j]\,[v_l^t G W^t H^t u_k] \\
&= u_i^t H\, E\{W G^t v_j v_l^t G W^t\}\, H^t u_k \\
&= u_i^t H\, [\mathrm{Trace}(G^t v_j v_l^t G)\, \sigma^2 I]\, H^t u_k \\
&= (u_i^t H H^t u_k) \cdot \sigma^2\, \mathrm{Trace}(G^t v_j v_l^t G) \\
&= \sigma^2\, (u_i^t H H^t u_k)\,(v_j^t G G^t v_l) \qquad (A.10)
\end{aligned}
$$

Consequently, the $z_{ij}$ will be uncorrelated if

$$
\begin{aligned}
u_i^t H H^t u_k &= 0 \quad \text{for } i \ne k \\
v_j^t G G^t v_l &= 0 \quad \text{for } j \ne l
\end{aligned}
\qquad (A.11)
$$

i.e., if U and V are the matrices that diagonalize $HH^t$ and $GG^t$
respectively. These latter matrices are related to the row and column
correlations in the block X:

$$
\begin{aligned}
E\{(X - \bar{X})(X - \bar{X})^t\}
&= E\{H W G^t G W^t H^t\} \\
&= H\, E\{W G^t G W^t\}\, H^t \\
&= H\, [\sigma^2 \cdot \mathrm{Trace}(G^t G)\, I]\, H^t \\
&= \sigma^2 \cdot n \cdot H H^t
\end{aligned}
$$

and

$$E\{(X - \bar{X})^t (X - \bar{X})\} = \sigma^2 \cdot n \cdot G G^t$$

Thus

$$H H^t = \frac{1}{\sigma^2 n}\, C_{row}$$

and

$$G G^t = \frac{1}{\sigma^2 n}\, C_{col}$$

where $C_{row}$ and $C_{col}$ are the row and column covariance matrices of
X, $E\,(X - \bar{X})(X - \bar{X})^t$ and $E\,(X - \bar{X})^t (X - \bar{X})$.


The  U  and  V  matrices  can  therefore  be  obtained  by  solving  the 
following  eigenvector/eigenvalue  problems: 


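where

$$\Lambda_{row} = \mathrm{Diag}(\lambda_1^r, \lambda_2^r, \ldots, \lambda_n^r), \qquad \lambda_i^r \ge 0$$

$$\Lambda_{col} = \mathrm{Diag}(\lambda_1^c, \lambda_2^c, \ldots, \lambda_n^c), \qquad \lambda_j^c \ge 0$$
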


Because  Crow  and  Cco^  are  positive  semi-definite  matrices,  both  U 
and  V  can  be  found  which  are  orthogonal. 
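
As a small worked case, a symmetric 2 x 2 covariance matrix can be diagonalized in closed form by a single rotation. The sketch below is illustrative only: the function name and the example matrix (unit variances, correlation 0.9) are hypothetical, and a block of realistic size would need a numerical eigenvalue routine instead.

```python
import math


def klt_2x2(c):
    """Eigen-decompose a symmetric 2 x 2 covariance matrix [[a, b], [b, d]].

    Returns (u, (lam1, lam2)) where the columns of u are orthonormal
    eigenvectors and lam1 >= lam2.
    """
    a, b, d = c[0][0], c[0][1], c[1][1]
    theta = 0.5 * math.atan2(2.0 * b, a - d)   # rotation that zeroes the off-diagonal
    cs, sn = math.cos(theta), math.sin(theta)
    u = [[cs, -sn], [sn, cs]]
    lam1 = a * cs * cs + 2.0 * b * cs * sn + d * sn * sn   # u_1^t C u_1
    lam2 = a * sn * sn - 2.0 * b * cs * sn + d * cs * cs   # u_2^t C u_2
    return u, (lam1, lam2)


C = [[1.0, 0.9], [0.9, 1.0]]
U, (lam1, lam2) = klt_2x2(C)
# Off-diagonal entry u_1^t C u_2 of the rotated matrix; it should vanish.
off_diag = sum(U[i][0] * C[i][j] * U[j][1] for i in range(2) for j in range(2))
print(lam1, lam2, off_diag)
```

For this example the eigenvalues are 1.9 and 0.1, so most of the energy is compacted into the first coefficient.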
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(A. 12) 


(A. 13) 


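A.3 Predicted KLT Coefficient Statistics


In order to code the $z_{ij}$, the corresponding mean and variance are
required. Using the U and V calculated above, the mean is given by
(A.9), and the variance by the specialization of (A.10) to the case
where k = i and l = j:

$$
\begin{aligned}
E\,(z_{ij} - \bar{z}_{ij})^2
&= \sigma^2\, [u_i^t H H^t u_i]\,[v_j^t G G^t v_j] \\
&= \sigma^2 \left[u_i^t \left(\frac{1}{\sigma^2 n} C_{row}\right) u_i\right] \left[v_j^t \left(\frac{1}{\sigma^2 n} C_{col}\right) v_j\right] \\
&= \sigma^2\, \lambda_i^r\, \lambda_j^c \qquad (A.15)
\end{aligned}
$$

For the greatest energy compaction into the $z_{ij}$ with the smallest
indices, we impose an ordering on the columns of U and V such that

$$\lambda_1^r \ge \lambda_2^r \ge \cdots \ge \lambda_n^r$$

and

$$\lambda_1^c \ge \lambda_2^c \ge \cdots \ge \lambda_n^c \qquad (A.16)$$

The mean and variance of the $z_{ij}$ can be compactly summarized by a
matrix form of equations (A.9) and (A.15):

$$\bar{Z} \triangleq [\bar{z}_{ij}] = [E\, z_{ij}] = U^t \bar{X} V \qquad (A.17)$$

and

$$[\sigma_{ij}^2] \triangleq [E\,(z_{ij} - \bar{z}_{ij})^2] = \sigma^2\, \lambda^r (\lambda^c)^t \qquad (A.18)$$

where

$$\lambda^r = \begin{bmatrix} \lambda_1^r \\ \vdots \\ \lambda_n^r \end{bmatrix}, \qquad \lambda^c = \begin{bmatrix} \lambda_1^c \\ \vdots \\ \lambda_n^c \end{bmatrix}$$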
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APPENDIX  B 


HOMOGENIZING  IN  TRANSFORM  COEFFICIENT  SPACE 

In  order  to  smooth  out  structural  artifacts  induced  in  coefficient 
sample  statistics  due  to  too  small  a  sample  space,  the  sample  space  is 
artificially  expanded  by  the  addition  of  new  members  synthesized  from 
original  members. 

In particular, statistics of the coefficient arrays

$$Z = U^t X V$$

are required, where X is an image block and U and V are specified
unitary matrices. The sample space of Z's is

$$\mathcal{Z} = \{ Z : Z = U^t X V,\ X \in \mathcal{X} \}$$

where $\mathcal{X}$ is the sample space of image blocks X, consisting of all m x n
blocks X obtained by partitioning the designated images (often m = n).

The sample space $\mathcal{Z}$ is expanded to $\mathcal{Z}'$ by expanding $\mathcal{X}$ to $\mathcal{X}'$.

This latter expansion is obtained by including the following blocks
in $\mathcal{X}'$:

For all X in $\mathcal{X}$:

•  Original: $X = [x_{i,j}]$

•  Horizontal Flip: $X^H = [x_{i,\,n+1-j}]$

•  Vertical Flip: $X^V = [x_{m+1-i,\,j}]$

•  Double Flip: $X^{HV} = [x_{m+1-i,\,n+1-j}]$


For  square  case  (m  =  n),  also  include: 


V 


B.2 Assumption


We will assume that U and V have special symmetry properties:

$$F_m U = U J_m, \qquad F_n V = V J_n$$

where $J_n$, $J_m$ are square matrices of the form

$$J = \mathrm{diag}(1, -1, 1, -1, \ldots)$$


This  property  means  that  the  1st,  3rd,  etc.  columns  of  U  are  symmetric 
about  their  midpoint,  and  that  the  2nd,  4th,  etc.  are  anti -symmetric. 
The  following  transforms  have  this  property: 

•  Cosine 


•  Sine 

•  Hadamard 

•  Slant 

•  Karhunen-Loeve  when  U  and  V  are  based  on  covariance  matrices 
symmetric  about  the  ortho-diagonal. 
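
The symmetry property can be checked directly for the cosine transform. The sketch below is illustrative only: it assumes the orthonormal DCT-II basis with basis vectors stored as columns, takes F to be the matrix that reverses row order and J = diag(1, -1, 1, ...), and uses hypothetical helper names.

```python
import math


def dct_matrix(n):
    """Orthonormal DCT-II basis; column k is the kth basis vector."""
    u = [[0.0] * n for _ in range(n)]
    for k in range(n):
        scale = math.sqrt((1.0 if k == 0 else 2.0) / n)
        for i in range(n):
            u[i][k] = scale * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
    return u


def flip_rows(m):
    """Apply the flip matrix F on the left: reverse the row order."""
    return m[::-1]


def alternate_column_signs(m):
    """Apply J on the right: negate the 2nd, 4th, ... columns."""
    return [[v if k % 2 == 0 else -v for k, v in enumerate(row)] for row in m]


U = dct_matrix(8)
FU, UJ = flip_rows(U), alternate_column_signs(U)
max_gap = max(abs(a - b) for ra, rb in zip(FU, UJ) for a, b in zip(ra, rb))
print(max_gap)
```

The gap is at rounding level, reflecting that the 1st, 3rd, ... cosine basis vectors are symmetric about their midpoint and the 2nd, 4th, ... are anti-symmetric.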

B.3 $\mathcal{Z}'$ from $\mathcal{Z}$ directly


Based on all of this, the coefficient arrays $Z^H$, $Z^V$ and $Z^{HV}$
obtained from $X^H$, $X^V$ and $X^{HV}$ can be predicted from Z alone:

$$
\begin{aligned}
Z^H &= U^t X^H V = U^t (X F_n)\, V = U^t X (F_n V) \\
    &= U^t X (V J_n) = U^t X V J_n \\
    &= Z J_n
\end{aligned}
$$

$$
\begin{aligned}
Z^V &= U^t X^V V = U^t (F_m X)\, V = (U^t F_m)\, X V \\
    &= (U^t F_m^t)\, X V = (F_m U)^t X V \\
    &= (U J_m)^t X V = (J_m^t U^t)\, X V \\
    &= J_m Z
\end{aligned}
$$

$$Z^{HV} = J_m Z J_n$$
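
The prediction of the flipped-block coefficients from Z alone can be confirmed on a small example. The sketch below is illustrative only: it assumes U = V is the orthonormal DCT-II basis, uses an arbitrary 4 x 4 block of made-up pixel values, and checks only the horizontal flip.

```python
import math


def dct_matrix(n):
    """Orthonormal DCT-II basis; column k is the kth basis vector."""
    return [[math.sqrt((1.0 if k == 0 else 2.0) / n)
             * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
             for k in range(n)] for i in range(n)]


def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(a):
    return [list(row) for row in zip(*a)]


def coefficients(x, u, v):
    """Z = U^t X V."""
    return matmul(matmul(transpose(u), x), v)


X = [[52, 55, 61, 66], [63, 59, 55, 90], [62, 59, 68, 113], [63, 58, 71, 122]]
U = V = dct_matrix(4)
Z = coefficients(X, U, V)
# Transform of the horizontally flipped block (columns reversed).
Z_H = coefficients([row[::-1] for row in X], U, V)
# Prediction Z J_n: negate the 2nd, 4th, ... columns of Z.
predicted = [[z if j % 2 == 0 else -z for j, z in enumerate(row)] for row in Z]
max_gap = max(abs(a - b) for ra, rb in zip(Z_H, predicted) for a, b in zip(ra, rb))
print(max_gap)
```

Because the flipped arrays differ from Z only by sign changes, they never need to be computed by transforming the flipped blocks.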


In the square case, m = n, the coefficient array $Z^{Trans}$ resulting
from $U^t X^t V$ is not necessarily the transpose of $Z = U^t X V$. In
cases where it's not, the following coefficient arrays are added
to $\mathcal{Z}'$:

$$Z^{Trans}, \qquad Z_H^{Trans} = Z^{Trans} J, \qquad Z_V^{Trans} = J Z^{Trans}, \qquad Z_{VH}^{Trans} = J Z^{Trans} J$$

In those square cases where U = V, the situation is simpler since
$Z^{Trans} = Z^t$ and the following arrays (all obtainable from Z
directly) are added to $\mathcal{Z}'$:

$$Z^t, \qquad Z^t J, \qquad J Z^t, \qquad J Z^t J$$
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B.4  Statistics 


Mean:


The mean array $\bar{Z}$ obtained by averaging over all
elements of $\mathcal{Z}'$ can be obtained by averaging the
following over all members Z of $\mathcal{Z}$ (and, for the second
case, $Z^{Trans}$ of $\mathcal{Z}^{Trans}$):


m  t  n: 


I[Z  +  zv  +  ZH  +  ZHV] 


=  +  zn  + 


■1, 


m  »  n;  U  £  V:  \ 


’I, 


u  o' 


(Z  ♦  ZTrans) 


m  =  n; 


V:  2 


‘I , 


(Z  +  IZ) 


•lr 


Mean Square Value:


The mean square value array of the set $\mathcal{Z}'$ can be
directly obtained by element-wise mean square
averaging of the following over all members Z of $\mathcal{Z}$
(and $Z^{Trans}$ of $\mathcal{Z}^{Trans}$ for the 2nd case):

•  m ≠ n: Z

•  m = n, U ≠ V: Z, $Z^{Trans}$

•  m = n, U = V: Z, $Z^t$
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APPENDIX  C 


OPTIMAL  CODER  ASSIGNMENTS  FOR  KLT  COEFFICIENTS 


We pose the coder assignment problem as one of minimizing the total
coding error $Er_{TOT}$ subject to not exceeding a maximum bit allocation
$B_{MAX}$. To be specific, suppose there are L blocks in an image. Then the
total mean square error is the sum of each individual block's error:

$$Er_{TOT} = \sum_{l=1}^{L} Er(l) \qquad (C.1)$$

and the total bit allocation $B_{TOT}$ is equal to the sum of the individual
blocks' bit allocations B(l):

$$B_{TOT} = \sum_{l=1}^{L} B(l) \qquad (C.2)$$

The optimization problem can be posed as:

$$\text{minimize } Er_{TOT} \quad \text{subject to } B_{TOT} \le B_{MAX} \qquad (C.3)$$

This is most easily approached as a Lagrange multiplier problem in which
the functional J is formed:

$$
\begin{aligned}
J &= Er_{TOT} + \lambda\,[B_{TOT} - B_{MAX}] \\
  &= \sum_{l} Er(l) + \lambda \Big[\sum_{l} B(l) - B_{MAX}\Big]
\end{aligned}
\qquad (C.4)
$$

The optimum values of B(l) are found by taking partial derivatives of J
and setting them to zero:

$$\frac{\partial J}{\partial B(l)} = 0, \qquad l = 1, \ldots, L \qquad (C.5)$$

This yields

$$\frac{\partial Er(l)}{\partial B(l)} = -\lambda, \qquad l = 1, \ldots, L, \qquad (C.6)$$

where the multiplier $\lambda$ is given by:

$$\lambda = -\frac{\partial Er_{TOT}}{\partial B_{TOT}} \qquad (C.7)$$

which is the (negative) slope of the overall coding error/coding rate curve.


Now, since $\lambda$ is global and does not depend upon the block index l,
we see that if $\lambda$ is known, we have L independent problems:

$$\frac{\partial Er(l)}{\partial B(l)} = -\lambda \qquad (C.8)$$

which means that each block's bit allocation can be separately
determined. The key point here is that $\lambda$ provides global fidelity control:
specifying $\lambda$ determines where on the coding error/coding rate curve we
will operate. Armed with that information, each block's allocation
follows by solving (C.8) for the appropriate l.

The problem of specifying the correct $\lambda$ to ensure constraint
satisfaction ($B_{TOT} \le B_{MAX}$) is called rate equalization. It is treated
in Section 6. For present purposes we consider $\lambda$ given.


C.1 Single Block Problem


To solve (C.8), it is necessary to expand both Er(l) and B(l).

Since each block is separately solved, we will drop the "l" argument for
notational simplicity. Adopting the energy error measure, we have

$$Er = \sum_{i,j=1}^{n} E\,(x_{ij} - \hat{x}_{ij})^2 \qquad (C.9)$$

where $\hat{x}_{ij}$ is the reconstructed version of pixel $x_{ij}$. Because the KLT is
unitary, this last expression can be computed equally well in the
transform domain, yielding

$$Er = \sum_{i,j=1}^{n} E\,(z_{ij} - \hat{z}_{ij})^2 \qquad (C.10)$$

where $\hat{z}_{ij}$ is the reconstructed version of coefficient $z_{ij}$.

Also, the total bit allocation for the block can be expanded in
terms of the bit allocations for each coefficient, yielding

$$B = \sum_{i,j=1}^{n} B_{ij}, \qquad (C.11)$$

where $B_{ij}$ is the bit allocation for $z_{ij}$. What we ultimately seek are
the $B_{ij}$'s.

Now, we notice that (C.8) is one of the necessary conditions
required to solve the single-block Lagrange multiplier problem

$$J' = \sum_{i,j} E\,(z_{ij} - \hat{z}_{ij})^2 + \lambda \Big[\sum_{i,j} B_{ij} - B\Big] \qquad (C.12)$$

which arises from wanting a solution to the following constrained
minimization problem:

$$\text{minimize } Er = \sum_{i,j} E\,(z_{ij} - \hat{z}_{ij})^2 \quad \text{subject to } B = \sum_{i,j} B_{ij} \qquad (C.13)$$

The remaining necessary conditions for the single-block Lagrange
multiplier problem are obtained by setting to zero the partials of J'
w.r.t. the $B_{ij}$:

$$\frac{\partial J'}{\partial B_{ij}} = \frac{\partial E\,(z_{ij} - \hat{z}_{ij})^2}{\partial B_{ij}} + \lambda = 0 \qquad (C.14)$$

or:

$$\frac{\partial E\,(z_{ij} - \hat{z}_{ij})^2}{\partial B_{ij}} = -\lambda \qquad (C.15)$$

Thus, again the problem reduces in scope: now we need only solve bit
allocations for a single coefficient at a time.


C.2 Single Coefficient Problem

For this we need a relationship connecting $E\,(z_{ij} - \hat{z}_{ij})^2$ and $B_{ij}$.

We make the following assumptions:

•  $E\, \hat{z}_{ij} = \bar{z}_{ij} = E\, z_{ij}$, i.e. the coder is unbiased.

•  The random variables $z_{ij}$ all share the same form of probability
   density function (pdf) with each parameterized by its mean $\bar{z}_{ij}$
   and variance $\sigma_{ij}^2$.

•  Quantization is performed by first subtracting $\bar{z}_{ij}$ from $z_{ij}$,
   then dividing by $\sigma_{ij}$, then finally passing the result
   through a $B_{ij}$-bit ($2^{B_{ij}}$-level) unbiased quantizer.
   Reconstruction re-introduces the factor $\sigma_{ij}$ and biases the
   result by $\bar{z}_{ij}$.

These assumptions are all in force for the coders used in this study. Under
these conditions

$$E\,(z_{ij} - \hat{z}_{ij})^2 = \sigma_{ij}^2\, f(B_{ij}) \qquad (C.16)$$

where f(·) is a monotonically decreasing positive function depending
upon the assumed pdf and the type of quantizer.

The analytical form of f(B) is generally unwieldy, and common
practice is to employ a simpler curve fit. Specifically, a fit of
the following form is used here:

$$f(B) = b\, 2^{-B/a} \qquad (C.17)$$

where b and a are parameters tailored to various types of pdf's and
quantizers. For example, for the case of Gaussian pdf's and Max
quantizers, which we use for KLT coefficient coding, good upper-bound
values are b = 2.2 and a = 0.5. Good lower-bound values are b = 1
and a = 0.5.


Given the fit (C.17) and the expression (C.16), the necessary
condition (C.15) reduces to:

$$\sigma_{ij}^2\, (\ln 2)\, (b/a)\, 2^{-B_{ij}/a} = \lambda, \qquad (C.18)$$

which yields:

$$B_{ij} = a \log_2 \left[ \frac{\sigma_{ij}^2}{\lambda a / (b \ln 2)} \right] \qquad (C.19)$$

For our cases of interest, a = 0.5. We also denote

$$D = \left( \frac{\lambda a}{b \ln 2} \right)^{1/2} \qquad (C.20)$$

to obtain

$$B_{ij} = \log_2 \left( \frac{\sigma_{ij}}{D} \right) \qquad (C.21)$$

This last expression tells how to determine the bit allocation $B_{ij}$ from
the coefficient variances and the global distortion control parameter D.
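
The single-coefficient rule B = log2(sigma / D) can be sketched in a few lines. This is an illustration under stated assumptions and not the coder used in the study: clipping negative allocations to zero is an added assumption, bit counts are left fractional, the function names and standard deviations are hypothetical, and the search for D uses a plain bisection as a stand-in, whereas the report relies on predictive rate equalization (Section 6) that avoids trial and error.

```python
import math


def allocate_bits(sigmas, d):
    """Bits per coefficient: log2(sigma / D), clipped at zero (assumed clipping)."""
    return [max(0.0, math.log2(s / d)) if s > 0 else 0.0 for s in sigmas]


def find_distortion_parameter(sigmas, total_bits, iterations=60):
    """Bisection on D so that the allocations sum to total_bits (illustrative)."""
    lo, hi = 1e-12, max(sigmas)        # lo spends too many bits, hi spends none
    for _ in range(iterations):
        mid = math.sqrt(lo * hi)       # geometric midpoint: bits depend on log D
        if sum(allocate_bits(sigmas, mid)) > total_bits:
            lo = mid                   # over budget, so D must increase
        else:
            hi = mid
    return hi


# Hypothetical coefficient standard deviations for one block.
sigmas = [16.0, 8.0, 4.0, 2.0, 1.0, 0.5]
d = find_distortion_parameter(sigmas, total_bits=6.0)
bits = allocate_bits(sigmas, d)
print(d, bits)
```

For these values the search settles at D = 2, giving 3, 2 and 1 bits to the three largest coefficients and none to the rest: coefficients whose standard deviation falls below D are simply not coded.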
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APPENDIX  D 


HOMOGENIZING  SVD'S  OF  PRETRANSFORMED  BLOCKS 

We begin with $U^t X V = S$, where X is derived by applying either
the cosine or Hadamard transform to the rows and columns of a pixel
block x. We wish to find the singular values and left and right
singular vectors for homogenized versions of X.

D.1 Flips

Corresponding to the set $x$, $x^H$, $x^V$, $x^{HV}$ of flipped pixel
blocks are the following transform blocks:

$$X, \qquad X J_n, \qquad J_m X, \qquad J_m X J_n$$

where $J_n$, $J_m$ are the n x n and m x m versions of the matrix:

$$J = \mathrm{diag}(1, -1, 1, -1, \ldots)$$

Note that this matrix has the property that:

$$J^{-1} = J^t = J.$$

Therefore, if

$$U^t X V = S,$$

then

$$
\begin{aligned}
U^t (X J_n)(J_n V) &= S \\
(J_m U)^t (J_m X)\, V &= S \\
(J_m U)^t (J_m X J_n)(J_n V) &= S
\end{aligned}
$$

Therefore, to homogenize the SVD wrt flips, average over the following:

•  singular values: S only

•  left singular vectors: U, $J_m U$

•  right singular vectors: V, $J_n V$

D.2 Transpose (Rotation and Flip)

Since the transform of $x^t$ is $X^t$, the following relationship:

$$U^t X V = S \ \Rightarrow\ V^t X^t U = S$$

tells us to homogenize wrt transposes by

•  singular values: S only

•  left singular vectors: U and V

•  right singular vectors: U and V
APPENDIX  E


SVD  ORTHOGONAL  EXPANSION  COEFFICIENT  STATISTICS 

This appendix constitutes an analysis to determine the first two
moments of the coefficients $\{\alpha_{ij}\}$ from the singular vector statistics
$\{\bar{u}_{ij}\}$ and $\{\sigma_{ij}^2\}$.

E.1 Problem


At the jth step, the orthogonal singular vectors $u_1, \ldots, u_{j-1}$ have
been quantized, transmitted, and decoded as $\hat{u}_1, \ldots, \hat{u}_{j-1}$. These
vectors are then used to find an orthogonal basis for the jth step
(assumes the $\hat{u}_i$'s are orthogonal).


BJ 


M  -N  -N  .  N  kN  ,  N 1 

L-r  -j’  -j+i . HiiJ 


where 


u, 


bi 


=  e. 


Ji 

,  -  I  <4  *>  *  -  S  >4  t 


i-1 

E 

i=j 


tS 


i-bi: 


^  • 


Then  express  ui  as 
i-1 


j-1  m 

=  E  “n  +  E  kf 

1  1*1  11  1  l=j  11  1 


We  wish  to  find  the  statistics  a,,  and  ~T  of  these  coefficients  in 

li  a,. 

order  to  encode  them. 


L 


E-l 


I 


a1j for  1 1 1 j  1 : 


u 


4  Z  ■  4  4 


*  4  •  4> 


t  ^ 
-u  •  u. 
-J  -1 


'V. 

u  =  u  -  u 


Eaij 

=  -E  (up  E  (uj) 

= 

0 

Eaij 

"  E  («J  ip2 

■  {(35Sf* 

>  ( 

Hj  4 ) 

i 

1 

■  trE  (gf 

)• 

e  (a, 

4> 

'  tr  dia5  (4i) 

• 

diag  ( 

Z) 

in  — s- 

v  %n2  ~T 

"  h  Uki  ^ 

d1 


£ 

k-l 


z 


\  'VM 

(This  uses  in  place  of  u. ,  i.e., 
uses  |u^|  *  1  in  order  to  obtain  a 

linear-in-d  expression  for  a?j.) 


E.3  Case  2 


o 

-a. 

— ». 

-*» 

o 

i  ^  j 

II 

J  •»-> 

u*  b*. 
-J  -ij 

j-1  1-1 

,  V  -N  .  -N  .  V  (.N>  hN 

Sij  '  *1  -  ft  “n  *  *  *  ft  ‘tij’i 


Ws 

i4n 

i 

Wi 


ij 

r  J-1 

i-1 

4 

ii  -  un 

L  1  1=1  11 

:ji  *  g  <>. 

r 

j-i 

i-1 

_  Ui  j 

V  -N 

-  2-r  Un 

1  =  1  1  ' 

alj  '  P5  (^lj} 

For simplification of notation, we will henceforth denote the stage-j
basis vectors by $b_i$, notationally suppressing the dependence of the b's
on the stage j.


! 
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Neglecting  all  higher  order  terms,  this  can  be  simplified  to: 


Ea?  =  t.  +  t. 

i  ij  ij 


Summary 


If 

hF 

i 

E 

1*1 

~S~ 

uli 

.  i  <  j 

t  r. 

V  T*1 

,  i  2  j 

+ 

h  “*'J 

•  '  ■  f. 


APPENDIX  F 


SVD  COEFFICIENT  CODING:  BIT  ALLOCATION 

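SVD block reconstruction error is given by:

$$
\begin{aligned}
\tilde{X} &= X - \hat{X} \\
&= \sum_l X_l - \sum_l \hat{X}_l \qquad (X = \sum_l X_l \text{ is the inverse SVD, } X_l = s_l B_l, \text{ where } B_l = u_l v_l^t) \\
&= \sum_l (X_l - \hat{X}_l) \\
&= \sum_l (s_l B_l - \hat{s}_l \hat{B}_l) \\
&= \sum_l [s_l B_l - (s_l - \tilde{s}_l)(B_l - \tilde{B}_l)] \\
&= \sum_l \{s_l B_l - s_l B_l + s_l \tilde{B}_l + \tilde{s}_l B_l - \tilde{s}_l \tilde{B}_l\} \\
&= \sum_l [s_l \tilde{B}_l + \tilde{s}_l B_l - \tilde{s}_l \tilde{B}_l] \\
&= \sum_l \tilde{X}_l \qquad (F.1)
\end{aligned}
$$

where

$s_l$ = lth singular value

$B_l = u_l v_l^t$ = lth basis block

$\tilde{s}_l$ = quantization error in $s_l$

$\tilde{B}_l$ = quantization error in $B_l$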
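
The decomposition of a single term's error into the three pieces s B~ + s~ B - s~ B~ can be checked numerically. The sketch below is illustrative only: the singular value, the vectors and their "quantized" versions are made-up numbers (the quantized vectors are not renormalized), and the helper names are hypothetical.

```python
def outer(u, v):
    """Basis block u v^t."""
    return [[a * b for b in v] for a in u]


def combine(c1, m1, c2, m2):
    """Element-wise c1*m1 + c2*m2 for two matrices of equal size."""
    return [[c1 * a + c2 * b for a, b in zip(r1, r2)] for r1, r2 in zip(m1, m2)]


# Exact quantities and their decoded (quantized) versions, all hypothetical.
s, s_hat = 5.0, 4.75
u, v = [0.6, 0.8], [1.0, 0.0]
u_hat, v_hat = [0.625, 0.75], [1.0, 0.0625]

B, B_hat = outer(u, v), outer(u_hat, v_hat)
s_err = s - s_hat                          # quantization error in s
B_err = combine(1.0, B, -1.0, B_hat)       # quantization error in B

# Left side: term error X_l - X_l^hat = s B - s_hat B_hat.
lhs = combine(s, B, -s_hat, B_hat)
# Right side: s B~ + s~ B - s~ B~.
rhs = combine(1.0, combine(s, B_err, s_err, B), -s_err, B_err)
max_gap = max(abs(a - b) for ra, rb in zip(lhs, rhs) for a, b in zip(ra, rb))
print(max_gap)
```

The last piece, s~ B~, is second order in the quantization errors, which is why it is small when both the singular value and the vectors are coded finely.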


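The squared reconstruction error is then

$$
\begin{aligned}
\tilde{X} \tilde{X}^t
&= \sum_l \sum_k [s_l \tilde{B}_l + \tilde{s}_l B_l - \tilde{s}_l \tilde{B}_l]\,[s_k \tilde{B}_k + \tilde{s}_k B_k - \tilde{s}_k \tilde{B}_k]^t \\
&= \sum_l \sum_k \big[\, s_l s_k \tilde{B}_l \tilde{B}_k^t + s_l \tilde{s}_k \tilde{B}_l B_k^t - s_l \tilde{s}_k \tilde{B}_l \tilde{B}_k^t \\
&\qquad\quad + \tilde{s}_l s_k B_l \tilde{B}_k^t + \tilde{s}_l \tilde{s}_k B_l B_k^t - \tilde{s}_l \tilde{s}_k B_l \tilde{B}_k^t \\
&\qquad\quad - \tilde{s}_l s_k \tilde{B}_l \tilde{B}_k^t - \tilde{s}_l \tilde{s}_k \tilde{B}_l B_k^t + \tilde{s}_l \tilde{s}_k \tilde{B}_l \tilde{B}_k^t \,\big]
\end{aligned}
\qquad (F.2)
$$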


Now  assume: 

•  si »  sj  uncorrelated  for  all  i  f  j 

%  -x, 

•  si*  si  uncorrelated  for  all  1  +  j 

(F.3) 

•  s'. ,  s  -  uncorrelated  for  all  i  +  j 

1  0 

•  B. ,  B.  uncorrelated  for  all  i  f  j 

■  J 

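Then $\tilde{X}_i$ is uncorrelated with $\tilde{X}_j$ for $i \ne j$ (assume Gaussian).

For uncorrelated $\tilde{X}_l$'s:

$$\mathrm{MSE} = \sum_l E\; \mathrm{tr}\, \tilde{X}_l \tilde{X}_l^t \qquad (F.4)$$

so let

$$e_l = \mathrm{tr}\, \tilde{X}_l \tilde{X}_l^t \qquad (F.5)$$

Then, suppressing subscripts:

$$
\begin{aligned}
e = \mathrm{tr}\, \big\{\, & s^2 \tilde{B}\tilde{B}^t + s\tilde{s}\, \tilde{B} B^t - s\tilde{s}\, \tilde{B}\tilde{B}^t \\
& + \tilde{s} s\, B \tilde{B}^t + \tilde{s}^2 B B^t - \tilde{s}^2 B \tilde{B}^t \\
& - \tilde{s} s\, \tilde{B}\tilde{B}^t - \tilde{s}^2 \tilde{B} B^t + \tilde{s}^2 \tilde{B}\tilde{B}^t \,\big\}
\end{aligned}
\qquad (F.6)
$$

We next want to take the expected value. We will apply this
theorem:

Theorem: $x_i$'s zero mean, jointly Gaussian $\Rightarrow$

$$
\begin{aligned}
E\, x_1 x_2 x_3 x_4 &= E\, x_1 x_2\; E\, x_3 x_4 \\
&\quad + E\, x_1 x_3\; E\, x_2 x_4 \\
&\quad + E\, x_1 x_4\; E\, x_2 x_3
\end{aligned}
\qquad (F.7)
$$

The form this takes for us is

$$
\begin{aligned}
E\, s_1 s_2 B_1 B_2 &= E\, s_1 s_2\; E\, B_1 B_2 + E\, s_1 B_1\; E\, s_2 B_2 \\
&\quad + E\, s_2 B_1\; E\, s_1 B_2
\end{aligned}
$$

Note: We are assuming that $s_l$, $\tilde{s}_l$, $B_l$, $\tilde{B}_l$ are jointly Gaussian.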
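
The fourth-moment theorem can be verified exactly on a small constructed case. The sketch below is illustrative only: the four variables are built as fixed linear combinations of three independent standard normals (the mixing coefficients are arbitrary), so they are zero mean and jointly Gaussian, and the fourth moment is computed from the known single-variable moments of a standard normal rather than by simulation.

```python
from itertools import product

# Moments E g^k of a standard normal for k = 0..4.
NORMAL_MOMENTS = {0: 1.0, 1: 0.0, 2: 1.0, 3: 0.0, 4: 3.0}

# Row i holds the coefficients of x_i on three independent standard normals.
A = [[1.0, 0.5, 0.0],
     [0.2, 1.0, -0.3],
     [0.0, 0.7, 1.0],
     [0.4, -0.1, 0.6]]


def cov(i, j):
    """E x_i x_j for the linear combinations defined by A."""
    return sum(a * b for a, b in zip(A[i], A[j]))


def fourth_moment():
    """E x_1 x_2 x_3 x_4, expanding the product over the independent normals."""
    total = 0.0
    for idx in product(range(3), repeat=4):
        weight = A[0][idx[0]] * A[1][idx[1]] * A[2][idx[2]] * A[3][idx[3]]
        expectation = 1.0
        for g in range(3):
            # Independence: the expectation factors over the distinct normals.
            expectation *= NORMAL_MOMENTS[idx.count(g)]
        total += weight * expectation
    return total


pairings = cov(0, 1) * cov(2, 3) + cov(0, 2) * cov(1, 3) + cov(0, 3) * cov(1, 2)
print(fourth_moment(), pairings)
```

The two printed values agree to rounding, matching the three-pairing form of the theorem.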

Now,  applying  this  to  each  term  of  the  previous  expression  for 
e  generates  terms  involving: 


r  2  _  'N, 

E  s  ,  Ess,  E  s 


E  B  B\  EBB1,  EBB1 

%  %  %  % 

E  s  B,  EsB,  EsB  and  E  s  B 


We  assume  those  underlined  to  be  zero.  This  results  in: 


|  ~7  x  'Vf-  ^5"  t  'v?’  'V  'vt  ) 

Ee  =  tr  s^  B  Bc  +  s‘  B  B1  +  s^  B  Bl 


where  overbar  indicates  expected  value. 


Now  B  =  B-B  =  uvt-uvt 


=  u  v^  -  (u  -  u)  (v  -  v)t 


t  r  t  ^  t  n^t  'V  'V-1 

=  uv  -[uv  -  UV  -  IJ  v  +  uv] 


UV  +  UV  -  uv 
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(F.8) 


(F.9) 


(F. 10) 


(F. 11) 


•  •  B  B* 


rx  t  %  t  ^  t 

[uv  VU  +  UV  vu 


% 

U  V 


t  ^  M: 
V  u 


+  U  V 


,  'Vt  ^  t 
V  U  +  U  V  V  u 


%t  ^ 

U  V  V  u 


“V  "V+  %  0,+  'Xj  t  %  %t  "v  M-t 

U  V  V  U  -  U  V  V  u  +  u  vl  V  u  ] 


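Now we want to take $\mathrm{tr}\,\{E\{\cdot\}\}$. In doing so, we will assume:

•  $u$, $v$ uncorrelated

•  $\tilde{u}$, $\tilde{v}$ uncorrelated

•  $u$, $\tilde{u}$ uncorrelated

•  $v$, $\tilde{v}$ uncorrelated

Thus:

\[
\begin{aligned}
E\{\mathrm{tr}\,\tilde{B}\tilde{B}^t\} &= E\,u^t u \; E\,\tilde{v}^t\tilde{v} + E\,\tilde{u}^t u \; E\,\tilde{v}^t v - E\,\tilde{u}^t u \; E\,\tilde{v}^t\tilde{v} \\
 &\quad + E\,u^t\tilde{u} \; E\,v^t\tilde{v} + E\,\tilde{u}^t\tilde{u} \; E\,v^t v - E\,\tilde{u}^t\tilde{u} \; E\,v^t\tilde{v} \\
 &\quad - E\,u^t\tilde{u} \; E\,\tilde{v}^t\tilde{v} - E\,\tilde{u}^t\tilde{u} \; E\,\tilde{v}^t v + E\,\tilde{u}^t\tilde{u} \; E\,\tilde{v}^t\tilde{v} \\
 &= E|u|^2 \, E|\tilde{v}|^2 + E|\tilde{u}|^2 \, E|v|^2 + E|\tilde{u}|^2 \, E|\tilde{v}|^2
\end{aligned}
\tag{F.12}
\]

Also,

\[
BB^t = (uv^t)(uv^t)^t = u v^t v u^t \tag{F.13}
\]

\[
\mathrm{tr}\,BB^t = u^t u \; v^t v \tag{F.14}
\]

\[
E\{\mathrm{tr}\,BB^t\} = E|u|^2 \, E|v|^2 \tag{F.15}
\]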
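Two of the algebraic identities used above are easy to confirm numerically: the error expansion $B - \hat{B} = u\tilde{v}^t + \tilde{u}v^t - \tilde{u}\tilde{v}^t$ and the trace identity $\mathrm{tr}\,BB^t = |u|^2 |v|^2$. The sketch below uses small hypothetical vectors and hypothetical quantized values; it illustrates the identities only and says nothing about the statistical assumptions.

```python
def outer(x, y):
    """Outer product x y^t as a list of rows."""
    return [[xi * yj for yj in y] for xi in x]

def combine(p, q, sign=1.0):
    """Elementwise p + sign * q for equally sized matrices."""
    return [[a + sign * b for a, b in zip(rp, rq)] for rp, rq in zip(p, q)]

def trace_of_gram(m):
    """tr(M M^t), which equals the sum of squared entries of M."""
    return sum(e * e for row in m for e in row)

def squared_norm(x):
    return sum(xi * xi for xi in x)

# Hypothetical unit-norm vectors and coarsely quantized versions of them.
u, v = [0.6, 0.8], [0.28, 0.96]
u_hat, v_hat = [0.625, 0.75], [0.25, 1.0]
u_err = [a - b for a, b in zip(u, u_hat)]   # u tilde
v_err = [a - b for a, b in zip(v, v_hat)]   # v tilde

# Error in B computed directly and via the three-term expansion.
b_err_direct = combine(outer(u, v), outer(u_hat, v_hat), sign=-1.0)
b_err_expanded = combine(
    combine(outer(u, v_err), outer(u_err, v)),
    outer(u_err, v_err),
    sign=-1.0,
)
max_gap = max(
    abs(a - b)
    for rp, rq in zip(b_err_direct, b_err_expanded)
    for a, b in zip(rp, rq)
)

# tr(B B^t) against |u|^2 |v|^2.
trace_gap = abs(trace_of_gram(outer(u, v)) - squared_norm(u) * squared_norm(v))
print(max_gap, trace_gap)
```

Both gaps should vanish up to floating-point rounding.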
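Now:

\[
\begin{aligned}
E\,\tilde{s}^2 &= \sigma_s^2 \, f_s(n_s), & \sigma_s^2 &= E\,s^2 - (E\,s)^2 \\
E\,\tilde{u}_i^2 &= \sigma_{u_i}^2 \, f_u(n_{u_i}), & \sigma_{u_i}^2 &= E\,u_i^2 - (E\,u_i)^2 \\
E\,\tilde{v}_j^2 &= \sigma_{v_j}^2 \, f_v(n_{v_j}), & \sigma_{v_j}^2 &= E\,v_j^2 - (E\,v_j)^2
\end{aligned}
\tag{F.16}
\]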


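Finally:

\[
\begin{aligned}
E\,e &= \overline{s^2}\;\mathrm{tr}\,\overline{\tilde{B}\tilde{B}^t} + \overline{\tilde{s}^2}\;\mathrm{tr}\,\overline{BB^t} + \overline{\tilde{s}^2}\;\mathrm{tr}\,\overline{\tilde{B}\tilde{B}^t} \\
 &= (\sigma_s^2 + \bar{s}^2)\;\mathrm{tr}\,\overline{\tilde{B}\tilde{B}^t} + \sigma_s^2 f_s(n_s)\;\mathrm{tr}\,\overline{BB^t} + \sigma_s^2 f_s(n_s)\;\mathrm{tr}\,\overline{\tilde{B}\tilde{B}^t} \\
 &= \big[\bar{s}^2 + \sigma_s^2 (1 + f_s(n_s))\big]\;\mathrm{tr}\,\overline{\tilde{B}\tilde{B}^t} + \sigma_s^2 f_s(n_s)\;\mathrm{tr}\,\overline{BB^t} \\
 &= \big[\bar{s}^2 + \sigma_s^2 (1 + f_s(n_s))\big] \cdot \Big[ \sum_i \sigma_{u_i}^2 f_u(n_{u_i}) \sum_j (\bar{v}_j^2 + \sigma_{v_j}^2) \\
 &\qquad + \sum_i (\bar{u}_i^2 + \sigma_{u_i}^2) \sum_j \sigma_{v_j}^2 f_v(n_{v_j}) + \sum_i \sigma_{u_i}^2 f_u(n_{u_i}) \sum_j \sigma_{v_j}^2 f_v(n_{v_j}) \Big] \\
 &\quad + \sigma_s^2 f_s(n_s) \sum_i (\bar{u}_i^2 + \sigma_{u_i}^2) \sum_j (\bar{v}_j^2 + \sigma_{v_j}^2)
\end{aligned}
\]

We will be interested in partial derivatives of this expression w.r.t. $n_s$, $n_{u_i}$ and $n_{v_j}$.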
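Now, using the fits:

\[
f_s(n) = b_s \, 2^{-n/a_s}, \qquad f_u(n) = f_v(n) = b \, 2^{-n/a} \tag{F.19}
\]

we shall find explicit expressions for the remaining partials:

\[
\begin{aligned}
\frac{\partial f_s(n_s)}{\partial n_s} &= -\Big(\frac{b_s}{a_s}\Big) 2^{-n_s/a_s} \ln 2 \\
\frac{\partial f_u(n_{u_i})}{\partial n_{u_i}} &= -\Big(\frac{b}{a}\Big) 2^{-n_{u_i}/a} \ln 2 \\
\frac{\partial f_v(n_{v_j})}{\partial n_{v_j}} &= -\Big(\frac{b}{a}\Big) 2^{-n_{v_j}/a} \ln 2
\end{aligned}
\tag{F.20}
\]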
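The exponential fits and their slopes can be sanity-checked with a finite difference. This is a small sketch with hypothetical constants (`A_HYP`, `B_HYP` stand in for fitted values of $a$ and $b$, which are not reproduced here); it also shows the practical reading of the fit, namely that every additional $a$ bits halves the normalized error.

```python
import math

def fit(n, a, b):
    """Normalized quantization error model f(n) = b * 2**(-n / a), as in (F.19)."""
    return b * 2.0 ** (-n / a)

def fit_slope(n, a, b):
    """Analytic derivative df/dn = -(b / a) * 2**(-n / a) * ln 2, as in (F.20)."""
    return -(b / a) * 2.0 ** (-n / a) * math.log(2.0)

def central_difference(func, n, step=1e-5):
    """Two-sided finite-difference estimate of d func / d n."""
    return (func(n + step) - func(n - step)) / (2.0 * step)

# Hypothetical fit constants, for illustration only.
A_HYP, B_HYP = 1.1, 1.3

numeric = central_difference(lambda n: fit(n, A_HYP, B_HYP), 3.0)
analytic = fit_slope(3.0, A_HYP, B_HYP)
halving_ratio = fit(3.0 + A_HYP, A_HYP, B_HYP) / fit(3.0, A_HYP, B_HYP)
print(numeric, analytic, halving_ratio)
```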


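Now, to find expressions for bit allocations, we will solve the following problem:

minimize MSE

subject to

\[
\sum_l \Big[\, n_{s_l} + \sum_i n_{u_{il}} + \sum_j n_{v_{jl}} \Big] = N_T \tag{F.21}
\]

where $l$ indexes the terms $e_l$ of the MSE, $i$ the elements of $u_l$, and $j$ the elements of $v_l$.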
We will approach the problem via Lagrange multipliers. That is, we form the functional

\[
J = \mathrm{MSE} + \lambda \Big[ \sum_l n_{s_l} + \sum_l \sum_i n_{u_{il}} + \sum_l \sum_j n_{v_{jl}} - N_T \Big] \tag{F.22}
\]

and set the derivatives $\dfrac{\partial J}{\partial n_{s_l}}$, $\dfrac{\partial J}{\partial n_{u_{il}}}$, $\dfrac{\partial J}{\partial n_{v_{jl}}}$ equal to 0.

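From the previous expressions for $E\,e$ (one term of $\mathrm{MSE} = \sum_l E\,e_l$), we have:

\[
\Big[ \sum_i \big( \bar{u}_i^2 + \sigma_{u_i}^2 (1 + b\,2^{-n_{u_i}/a}) \big) \Big] \cdot \Big[ \sum_j \big( \bar{v}_j^2 + \sigma_{v_j}^2 (1 + b\,2^{-n_{v_j}/a}) \big) \Big] \cdot \sigma_s^2 \, (\ln 2) \Big(\frac{b_s}{a_s}\Big) 2^{-n_s/a_s} = \lambda \tag{F.23a}
\]

\[
\big[ \bar{s}^2 + \sigma_s^2 (1 + b_s 2^{-n_s/a_s}) \big] \cdot \sigma_{u_i}^2 \cdot \Big[ \sum_j \big( \bar{v}_j^2 + \sigma_{v_j}^2 (1 + b\,2^{-n_{v_j}/a}) \big) \Big] \cdot (\ln 2) \Big(\frac{b}{a}\Big) 2^{-n_{u_i}/a} = \lambda \tag{F.23b}
\]

\[
\big[ \bar{s}^2 + \sigma_s^2 (1 + b_s 2^{-n_s/a_s}) \big] \cdot \Big[ \sum_i \big( \bar{u}_i^2 + \sigma_{u_i}^2 (1 + b\,2^{-n_{u_i}/a}) \big) \Big] \cdot \sigma_{v_j}^2 \, (\ln 2) \Big(\frac{b}{a}\Big) 2^{-n_{v_j}/a} = \lambda \tag{F.23c}
\]

for every term $E\,e_l$ in MSE, where $\lambda$ is the value of the sensitivity of the total error to the total allocation:

\[
\lambda = -\frac{\partial\,\mathrm{MSE}}{\partial N_T} \tag{F.24}
\]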
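The stationarity conditions (F.23) can be cross-checked against finite differences of $E\,e$. The sketch below is an illustration under stated assumptions, not the study's software: all statistics, fit constants, and bit counts in `P`, `N_S`, `N_U`, `N_V` are hypothetical, and $E\,e$ is evaluated as $\overline{s^2}\,\mathrm{tr}\,\overline{\tilde{B}\tilde{B}^t} + \overline{\tilde{s}^2}\,(\mathrm{tr}\,\overline{BB^t} + \mathrm{tr}\,\overline{\tilde{B}\tilde{B}^t})$ with the exponential fits.

```python
import math

LN2 = math.log(2.0)

# Hypothetical statistics and fit constants, chosen only for illustration.
P = {"s_mean": 10.0, "var_s": 4.0, "a_s": 1.2, "b_s": 1.5,
     "u_mean": [0.6, 0.8], "var_u": [0.01, 0.02],
     "v_mean": [0.8, 0.6], "var_v": [0.015, 0.005],
     "a": 1.1, "b": 1.3}
N_S, N_U, N_V = 4.0, [3.0, 2.0], [2.0, 3.0]

def fit(n, a, b):
    return b * 2.0 ** (-n / a)

def stats(n_s, n_u, n_v, p):
    """Second moments of the parameters and of their quantization errors."""
    eu = sum(m * m + var for m, var in zip(p["u_mean"], p["var_u"]))
    ev = sum(m * m + var for m, var in zip(p["v_mean"], p["var_v"]))
    eut = sum(var * fit(n, p["a"], p["b"]) for var, n in zip(p["var_u"], n_u))
    evt = sum(var * fit(n, p["a"], p["b"]) for var, n in zip(p["var_v"], n_v))
    s2 = p["s_mean"] ** 2 + p["var_s"]
    st2 = p["var_s"] * fit(n_s, p["a_s"], p["b_s"])
    return eu, ev, eut, evt, s2, st2

def expected_error(n_s, n_u, n_v, p):
    eu, ev, eut, evt, s2, st2 = stats(n_s, n_u, n_v, p)
    tr_b = eu * ev
    tr_b_err = eu * evt + eut * ev + eut * evt
    return s2 * tr_b_err + st2 * (tr_b + tr_b_err)

def analytic_conditions(n_s, n_u, n_v, p):
    """Left-hand sides of (F.23a), (F.23b) for each i, (F.23c) for each j."""
    eu, ev, eut, evt, s2, st2 = stats(n_s, n_u, n_v, p)
    g_s, g_u, g_v = s2 + st2, eu + eut, ev + evt
    slope = LN2 * p["b"] / p["a"]
    out = [g_u * g_v * p["var_s"] * LN2 * (p["b_s"] / p["a_s"])
           * 2.0 ** (-n_s / p["a_s"])]
    out += [g_s * var * g_v * slope * 2.0 ** (-n / p["a"])
            for var, n in zip(p["var_u"], n_u)]
    out += [g_s * g_u * var * slope * 2.0 ** (-n / p["a"])
            for var, n in zip(p["var_v"], n_v)]
    return out

def numeric_conditions(n_s, n_u, n_v, p, step=1e-5):
    """Minus the central-difference derivative of E e w.r.t. each bit count."""
    flat = [n_s] + list(n_u) + list(n_v)
    def at(x):
        return expected_error(x[0], x[1:1 + len(n_u)], x[1 + len(n_u):], p)
    out = []
    for k in range(len(flat)):
        hi, lo = list(flat), list(flat)
        hi[k] += step
        lo[k] -= step
        out.append(-(at(hi) - at(lo)) / (2.0 * step))
    return out

analytic = analytic_conditions(N_S, N_U, N_V, P)
numeric = numeric_conditions(N_S, N_U, N_V, P)
```

At an optimal allocation all entries of `analytic` would equal the common value $\lambda$; here the allocation is arbitrary, so only the agreement between the analytic and numeric derivatives is meaningful.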


To solve (F.23) we begin by noticing that the MSE in the various elements of $\tilde{u}_i$ are equal, and similarly, that the MSE in the various elements of $\tilde{v}_j$ are equal. We express this as:

\[
\begin{aligned}
\sigma_{u_i}^2 \, (b\,2^{-n_{u_i}/a}) &= d_u \qquad \forall\; i = 1, \ldots, m \\
\sigma_{v_j}^2 \, (b\,2^{-n_{v_j}/a}) &= d_v \qquad \forall\; j = 1, \ldots, n
\end{aligned}
\tag{F.25}
\]

The truth of these statements can be established by ratioing F.23b (and F.23c) for different i (and j).
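As a worked form of the ratio argument (our notation, with $i$ and $i'$ denoting two elements of the same vector): the factors of F.23b that depend on $s$ and on $v$ are the same for both elements, as are $\ln 2$, $b/a$ and $\lambda$, so dividing the condition for $i$ by the condition for $i'$ leaves

\[
\frac{\sigma_{u_i}^2 \, 2^{-n_{u_i}/a}}{\sigma_{u_{i'}}^2 \, 2^{-n_{u_{i'}}/a}} = 1
\quad\Longrightarrow\quad
\sigma_{u_i}^2 \, b\,2^{-n_{u_i}/a} = \sigma_{u_{i'}}^2 \, b\,2^{-n_{u_{i'}}/a} = d_u .
\]

The same steps applied to F.23c for two indices $j$ and $j'$ give the common value $d_v$.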

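We now simplify (F.23) by writing:

\[
\begin{aligned}
\gamma_s &= \bar{s}^2 + \sigma_s^2 (1 + b_s 2^{-n_s/a_s}) \\
         &= \bar{s}^2 + \sigma_s^2 + \sigma_s^2 b_s 2^{-n_s/a_s} \\
         &= \overline{s^2} + d_s
\end{aligned}
\tag{F.26a}
\]

where $d_s$ is defined as implied.

\[
\begin{aligned}
\gamma_u &= \sum_i \big[ \bar{u}_i^2 + \sigma_{u_i}^2 (1 + b\,2^{-n_{u_i}/a}) \big] \\
         &= \sum_i (\bar{u}_i^2 + \sigma_{u_i}^2) + \sum_i d_u \\
         &= \overline{|u|^2} + m\,d_u
\end{aligned}
\tag{F.26b}
\]

and similarly:

\[
\gamma_v = \overline{|v|^2} + n\,d_v \tag{F.26c}
\]

Now, we introduce first order approximations:

•  $d_s \ll \overline{s^2}$ : $\gamma_s \approx \overline{s^2}$  (F.27a)

•  $m\,d_u \ll \overline{|u|^2}$ : $\gamma_u \approx \overline{|u|^2} = 1$  (F.27b)

•  $n\,d_v \ll \overline{|v|^2}$ : $\gamma_v \approx \overline{|v|^2} = 1$  (F.27c)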
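Therefore, using (F.26) and (F.27), the equations (F.23) simplify to:

\[
d_s = \frac{\lambda a_s}{\ln 2} \tag{F.28a}
\]

\[
d_u = \frac{\lambda a}{\ln 2} \cdot \frac{1}{\overline{s^2}} \tag{F.28b}
\]

\[
d_v = \frac{\lambda a}{\ln 2} \cdot \frac{1}{\overline{s^2}} \tag{F.28c}
\]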


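In light of this, it is convenient to solve (F.25), and the definition of $d_s$ (in F.26a), in terms of the bit allocations:

\[
n_s = a_s \log_2 \frac{\sigma_s^2 \cdot b_s}{d_s} \tag{F.29a}
\]

\[
n_{u_i} = a \log_2 \frac{\sigma_{u_i}^2 \cdot b}{d_u} \tag{F.29b}
\]

\[
n_{v_j} = a \log_2 \frac{\sigma_{v_j}^2 \cdot b}{d_v} \tag{F.29c}
\]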
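The allocation rule can be sketched in a few lines. The function below is a minimal illustration, not the study's implementation: `allocate_bits` and the dictionary keys are hypothetical names, all numeric inputs are made up, and the closed-form solution for $\lambda$ is our own step. It relies on the observation that, with $d_s$, $d_u$, $d_v$ proportional to $\lambda$, every allocation in (F.29) has the form $a_k \log_2 (c_k / \lambda)$, so the budget (F.21) fixes $\log_2 \lambda$ directly. The sketch does not round to integers or prevent negative allocations, which a practical coder would need to handle.

```python
import math

def allocate_bits(terms, a_s, b_s, a, b, total_bits):
    """Return (bit allocations, lambda) meeting the total-bit constraint.

    terms: list of dicts with keys 's2_mean' (mean square singular value),
    'var_s', 'var_u' (list), 'var_v' (list); all variances must be positive.
    The allocations are ordered per term as [n_s, n_u..., n_v...].
    """
    ln2 = math.log(2.0)
    # Each parameter contributes (a_k, c_k) with n_k = a_k * log2(c_k / lam).
    params = []
    for t in terms:
        # d_s = lam * a_s / ln 2, so n_s = a_s * log2(var_s * b_s * ln2 / (a_s * lam)).
        params.append((a_s, t["var_s"] * b_s * ln2 / a_s))
        # d_u = d_v = lam * a / (ln 2 * s2_mean).
        for var in list(t["var_u"]) + list(t["var_v"]):
            params.append((a, var * b * ln2 * t["s2_mean"] / a))
    a_sum = sum(a_k for a_k, _ in params)
    weighted_logs = sum(a_k * math.log2(c_k) for a_k, c_k in params)
    # Sum of n_k = weighted_logs - a_sum * log2(lam) must equal total_bits.
    log2_lam = (weighted_logs - total_bits) / a_sum
    bits = [a_k * (math.log2(c_k) - log2_lam) for a_k, c_k in params]
    return bits, 2.0 ** log2_lam

# Hypothetical single-term example.
example_terms = [{"s2_mean": 100.0, "var_s": 25.0,
                  "var_u": [0.02, 0.01], "var_v": [0.02, 0.01]}]
bits, lam = allocate_bits(example_terms, a_s=1.0, b_s=1.0, a=1.0, b=1.0,
                          total_bits=20.0)
print([round(n, 2) for n in bits], lam)
```

In this form, parameters with larger variance receive more bits, and the singular-vector elements of a term gain bits as that term's mean-square singular value grows, which is the qualitative behavior (F.28) and (F.29) imply.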
MISSION
of
Rome Air Development Center

RADC plans and executes research, development, test and
selected acquisition programs in support of Command, Control,
Communications and Intelligence (C3I) activities. Technical
and engineering support within areas of technical competence
is provided to ESD Program Offices (POs) and other ESD
elements. The principal technical mission areas are
communications, electromagnetic guidance and control,
surveillance of ground and aerospace objects, intelligence data
collection and handling, information system technology,
ionospheric propagation, solid state sciences, microwave
physics and electronic reliability, maintainability and
compatibility.


